This application is a divisional of U.S. Serial No. 09/645,192, filed August 24, 2000, which 
claims priority to U.S. Serial No. 60/150,488, filed August 24, 1999. Each of these prior 
applications is incorporated by reference in its entirety. 

TECHNICAL FIELD 

The present invention relates generally to the biosynthesis of glycans found as free
oligosaccharides or covalently bound to proteins and glycolipids. In particular, this invention
relates to a family of nucleic acids encoding UDP-N-acetylglucosamine:
N-acetylgalactosamine-β1,6-N-acetylglucosaminyltransferases
(Core-β1,6-N-acetylglucosaminyltransferases), which add N-acetylglucosamine to the hydroxy
group at C6 of 2-acetamido-2-deoxy-D-galactosamine (GalNAc) in O-glycans of the core 1 and the
core 3 type, thereby forming the core 2 and core 4 types. Previously, two members of this family
have been identified and designated C2GnT1 and C2GnT2.

This invention is more particularly related to a gene encoding a third member of this family of
O-glycan β1,6-N-acetylglucosaminyltransferases, termed C2GnT3, probes to the DNA encoding
C2GnT3, DNA constructs comprising DNA encoding C2GnT3, recombinant plasmids and
recombinant methods for producing C2GnT3, recombinant methods for stably transforming or
transfecting cells for expression of C2GnT3, methods for identification of agents with the ability to
inhibit or stimulate C2GnT3 biological activity, and methods for identification of DNA
polymorphism in patients. In U.S. Provisional Patent Application No. 60/150,488, filed on
August 24, 1999, from which the present application claims priority, this novel Core 2
β6GlcNAc-transferase isoform was identified and designated C2GnTII. The designation
C2GnTII has here been replaced by the designation C2GnT3 in accordance with its scientific
publication (14).

BACKGROUND OF THE INVENTION 

O-linked protein glycosylation involves an initiation stage in which a family of
N-acetylgalactosaminyltransferases catalyzes the addition of N-acetylgalactosamine to serine or
threonine residues (1). Further assembly of O-glycan chains involves several successive or
alternative biosynthetic reactions: i) formation of simple mucin-type core 1 structures by UDP-Gal:
GalNAcα-R β1,3Gal-transferase activity; ii) conversion of core 1 to complex-type core 2 structures
by UDP-GlcNAc: Galβ1-3GalNAcα-R β1,6GlcNAc-transferase activities; iii) direct formation of
complex mucin-type core 3 by UDP-GlcNAc: GalNAcα β1,3GlcNAc-transferase activities; and iv)
conversion of core 3 to core 4 by UDP-GlcNAc: GlcNAcβ1-3GalNAcα-R β1,6GlcNAc-transferase
activity. The formation of β1,6GlcNAc branches (reactions ii and iv) may be considered a key
controlling event of O-linked protein glycosylation leading to structures produced upon
differentiation and malignant transformation (2-6). For example, increased formation of
GlcNAcβ1-6GalNAc branching in O-glycans has been demonstrated during T-cell activation,
during the development of leukemia, and for immunodeficiencies like Wiskott-Aldrich syndrome
and AIDS (7; 8). Core 2 branching may play a role in tumor progression and metastasis (9). In
contrast, many carcinomas show changes from complex O-glycans found in normal cell types to
immaturely processed simple mucin-type O-glycans such as T (Thomsen-Friedenreich antigen;
Galβ1-3GalNAcα1-R), Tn (GalNAcα1-R), and sialosyl-Tn (NeuAcα2-6GalNAcα1-R) (10). The
molecular basis for this has been extensively studied in breast cancer, where it was shown that
specific downregulation of a core 2 β6GlcNAc-transferase was responsible for the observed lack
of complex type O-glycans on the mucin MUC1 (6). O-glycan core assembly may therefore be
controlled by inverse changes in the expression level of Core-β1,6-N-acetylglucosaminyltransferases
and the sialyltransferases forming sialyl-T and sialyl-Tn.

Interestingly, the metastatic potential of tumors has been correlated with increased expression of
core 2 β6GlcNAc-transferase activity (5). The increase in core 2 β6GlcNAc-transferase activity
was associated with increased levels of poly-N-acetyllactosamine chains carrying sialyl-Le(x),
which may contribute to tumor metastasis by altering selectin-mediated adhesion (4; 11).
O-glycan core assembly is controlled by the expression of key enzyme activities;
however, epigenetic factors including posttranslational modification, topology, or competition
for substrates may also play a role in this process (11).

Changes in surface carbohydrates of T-cells have been identified during development and
activation. O-glycan branches of the core 2 type are restricted to immature thymocytes of the
thymic cortex but are no longer exposed on the surface of mature medullary thymocytes (17).
Core 2 structures on T-cell surface proteins are ligands for the S-type lectin galectin-1, which
participates in thymocyte - thymic epithelia interaction (18). The elimination of Core 2 structures
from the thymocyte cell surface was found to be essential for controlled apoptosis mediated by
galectin-1 (19).

A Core 2 β6GlcNAc-transferase isoform was initially identified as a critical enzyme in blood cell
development and differentiation and designated leukocyte form or L-Form (C2GnT-L) (12). The
gene encoding C2GnT-L has been cloned by expression cloning from a cDNA library of the
human promyelocytic leukemia cell line HL-60 (13). This gene has now been renamed
C2GnT1 (14). Using the C2GnT1 sequence as a probe for BLAST analysis of the human
expressed sequence tag database, a homologous gene encoding a second Core 2
β6GlcNAc-transferase isoform has been identified and designated C2/4GnT (15) and C2GnT-M (16).
This gene has now been renamed C2GnT2 (14).

C2GnT1 was predicted to control synthesis of core 2 selectin ligands in leukocytes and lymphoid
tissues; however, mice deficient in C2GnT1 exhibited only partial reduction in selectin ligand
production and no significant changes in lymphocyte homing properties (Ellies, L. G., et al.
1998, Immunity 9: 881-890). One possible explanation for these results would be the expression
of additional Core 2 β6GlcNAc-transferases. C2GnT2 does not appear to be a candidate, as its
expression pattern is restricted to mucus-secreting organs (15, 16).

Consequently, there exists a need in the art for detecting as yet unidentified
UDP-N-acetylglucosamine: Galactose-β1,3-N-acetylgalactosamine-α-R (GlcNAc to GalNAc) β1-6
N-acetylglucosaminyltransferases and identifying the primary structures of the genes encoding such
enzymes. The present invention meets this need, and further presents other related advantages.

SUMMARY OF THE INVENTION 



The present invention provides isolated nucleic acids encoding human UDP-N-acetylglucosamine:
N-acetylgalactosamine β1,6 N-acetylglucosaminyltransferase 3 (C2GnT3), including cDNA and
genomic DNA. C2GnT3 has acceptor substrate specificities comparable to C2GnT1 (14). The
complete nucleotide sequence encoding C2GnT3 is set forth in SEQ ID NO: 1 and in Figure 1.

Variations in one or more nucleotides may exist among individuals within a population due to
natural allelic variation. Any and all such nucleic acid variations are within the scope of the
invention. DNA sequence polymorphisms may also occur which lead to changes in the amino
acid sequence of a C2GnT3 polypeptide. These amino acid polymorphisms are also within the
scope of the present invention. In addition, species variations, i.e., variations in nucleotide
sequence naturally occurring among different species, are within the scope of the invention.

Among Core 2 β6GlcNAc-transferases, C2GnT3 appears to be the dominant isoform in thymus
(14). Thus, C2GnT3 is likely to have important functions during thymocyte development as well
as T-cell maturation and homing (14). The identification of agents with the ability to inhibit or
stimulate C2GnT3 enzymatic activity therefore has potential for both diagnostic and
therapeutic purposes in related diseases.

Access to the gene encoding C2GnT3 allows production of a glycosyltransferase for use in
formation of core 2-based O-glycan modifications on oligosaccharides, glycoproteins and
glycosphingolipids. This enzyme can be used, for example, in pharmaceutical or other
commercial applications that require synthetic addition of core 2-based O-glycans to these or
other substrates, in order to produce appropriately glycosylated glycoconjugates having
particular enzymatic, immunogenic, or other biological and/or physical properties.

In one aspect, the invention encompasses isolated nucleic acids comprising the nucleotide sequence
of nucleotides 1-1362 as set forth in Figure 1 or sequence-conservative or function-conservative
variants thereof. Also provided are isolated nucleic acids hybridizable with nucleic acids having the
sequence as set forth in Figure 1 or fragments thereof or sequence-conservative or
function-conservative variants thereof; preferably, the nucleic acids are hybridizable with C2GnT3
sequences under conditions of intermediate stringency, and, most preferably, under conditions of
high stringency. In one embodiment, the DNA sequence encodes the amino acid sequence shown in
Figure 1, from methionine (amino acid no. 1) to serine (amino acid no. 453). In another
embodiment, the DNA sequence encodes an amino acid sequence comprising a sequence from
proline (no. 39) to serine (no. 453) of the amino acid sequence set forth in Figure 1.
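
The residue numbers above can be related to nucleotide positions with simple codon arithmetic. The following is a minimal sketch in Python, assuming that nucleotide 1 is the first base of the initiator methionine codon; the function name `codon_span` is hypothetical and used only for illustration. Under that assumption, residues 39-453 map to nucleotides 115-1359, which agrees with the probe fragment (base pairs 115-1359) described for Figure 3, and the three nucleotides between 1359 and 1362 would correspond to a stop codon (an inference, not a statement from the text).

```python
from typing import Tuple


def codon_span(first_aa: int, last_aa: int) -> Tuple[int, int]:
    """Return the 1-based nucleotide span encoding residues first_aa..last_aa.

    Assumption: nucleotide 1 is the first base of the codon for residue 1.
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("residue numbers must satisfy 1 <= first_aa <= last_aa")
    start = 3 * (first_aa - 1) + 1  # first base of the first codon
    end = 3 * last_aa               # last base of the last codon
    return start, end


# Full-length polypeptide, methionine 1 to serine 453.
print(codon_span(1, 453))    # (1, 1359)
# Fragment from proline 39 to serine 453.
print(codon_span(39, 453))   # (115, 1359)
```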

In a related aspect, the invention provides nucleic acid vectors comprising C2GnT3 DNA
sequences, including but not limited to those vectors in which the C2GnT3 DNA sequence is
operably linked to a transcriptional regulatory element, with or without a polyadenylation sequence.
Cells comprising these vectors are also provided, including without limitation transiently and stably
expressing cells. Viruses, including bacteriophages, comprising C2GnT3-derived DNA sequences
are also provided. The invention also encompasses methods for producing C2GnT3 polypeptides.
Cell-based methods include without limitation those comprising: introducing into a host cell an
isolated DNA molecule encoding C2GnT3, or a DNA construct comprising a DNA sequence
encoding C2GnT3; growing the host cell under conditions suitable for C2GnT3 expression; and
isolating C2GnT3 produced by the host cell. A method for generating a host cell with de novo stable
expression of C2GnT3 comprises: introducing into a host cell an isolated DNA molecule encoding
C2GnT3 or an enzymatically active fragment thereof (such as, for example, a polypeptide
comprising amino acids 39-453 of the sequence set forth in Figure 1), or a DNA construct comprising
a DNA sequence encoding C2GnT3 or an enzymatically active fragment thereof; selecting and
growing host cells in an appropriate medium; and identifying stably transfected cells expressing
C2GnT3. The stably transfected cells may be used for the production of C2GnT3 enzyme for use as
a catalyst and for recombinant production of peptides or proteins with appropriate glycosylation.
For example, eukaryotic cells, whether normal or diseased cells, having their glycosylation pattern
modified by stable transfection as above, or components of such cells, may be used to deliver
specific glycoforms of glycopeptides and glycoproteins, such as, for example, as immunogens for
vaccination.

In yet another aspect, the invention provides isolated C2GnT3 polypeptides, including without
limitation polypeptides having the sequence set forth in Figure 1, polypeptides having the sequence
of amino acids 39-453 as set forth in Figure 1, and a fusion polypeptide consisting of at least amino
acids 39-453 as set forth in Figure 1 fused in frame to a second sequence, which may be any
sequence that is compatible with retention of C2GnT3 enzymatic activity in the fusion polypeptide.
Suitable second sequences include without limitation those comprising an affinity ligand or a
reactive group.



In a related aspect, methods are disclosed for the identification of agents with the ability to inhibit
or stimulate the enzymatic activity of C2GnT3. Assays utilizing C2GnT3 to screen for potential
inhibitors or stimulators thereof are encompassed by the invention. Furthermore, methods of
using C2GnT3 in the structure-based design of inhibitors or stimulators thereof are also an aspect
of the invention. Such a design would comprise the steps of determining the three-dimensional
structure of the C2GnT3 polypeptide, analyzing the three-dimensional structure for the likely
binding sites of donor and/or acceptor substrates, synthesis of a molecule that incorporates a
predictive reactive site, and determining the inhibiting or stimulating activity of the molecule.

In another aspect of the present invention, methods are disclosed for screening for mutations in the
coding region of the C2GnT3 gene using genomic DNA isolated from, e.g., blood cells of patients.
In one embodiment, the method comprises: isolation of DNA from a patient; PCR amplification of
the coding exon; DNA sequencing of amplified exon DNA fragments and establishing therefrom
potential structural defects of the C2GnT3 gene associated with disease.

In accordance with an aspect of the invention there is provided a method of, and products for (i.e. 
kits), diagnosing and monitoring conditions mediated by C2GnT3 by determining, in a biological 
sample, the presence of nucleic acid molecules and polypeptides of the invention. 

Still further, the invention provides a method for evaluating a test compound for its ability to 
modulate the biological activity of a C2GnT3 polypeptide of the invention. For example, a 
substance that inhibits or enhances the catalytic activity of a C2GnT3 polypeptide may be 
evaluated. "Modulate" refers to a change or an alteration in the biological activity of a 
polypeptide of the invention. Modulation may be an increase or a decrease in activity, a change 
in characteristics, or any other change in the biological, functional, or immunological properties 
of the polypeptide. 

Compounds which modulate the biological activity of a polypeptide of the invention may also be 
identified using the methods of the invention by comparing the pattern and level of expression of 
a nucleic acid molecule or polypeptide of the invention in biological samples, tissues and cells, 
in the presence, and in the absence of the compounds. 

In an embodiment of the invention a method is provided for screening a compound for
effectiveness as an antagonist of a polypeptide of the invention, comprising the steps of a)
contacting a sample containing said polypeptide with a compound, under conditions wherein
antagonist activity of said polypeptide can be detected, and b) detecting antagonist activity in the
sample.

Methods are also contemplated that identify compounds or substances (e.g. polypeptides), which
interact with C2GnT3 nucleic acid regulatory sequences (e.g. promoter sequences, enhancer
sequences, negative modulator sequences).

The nucleic acids, polypeptides, and substances and compounds identified using the methods of
the invention, may be used to modulate the biological activity of a C2GnT3 polypeptide of the
invention, and they may be used in the treatment of conditions mediated by C2GnT3 such as
proliferative diseases including cancer, and thymus-related disorders. Accordingly, the nucleic
acids, polypeptides, substances and compounds may be formulated into compositions for
administration to individuals suffering from one or more of these conditions. Therefore, the
present invention also relates to a composition comprising one or more of a polypeptide, nucleic
acid molecule, or substance or compound identified using the methods of the invention, and a
pharmaceutically acceptable carrier, excipient or diluent. A method for treating or preventing
these conditions is also provided comprising administering to a patient in need thereof, a
composition of the invention.

The present invention in another aspect provides means necessary for production of gene-based
therapies directed at the thymus. These therapeutic agents may take the form of polynucleotides
comprising all or a portion of a nucleic acid of the invention comprising a regulatory sequence of
a C2GnT3 nucleic acid placed in appropriate vectors or delivered to target cells in more direct
ways.

Having provided a novel C2GnT3, and nucleic acids encoding same, the invention accordingly
further provides methods for preparing oligosaccharides. In specific embodiments, the invention
relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture
comprising a donor substrate, and an acceptor substrate in the presence of a C2GnT3 polypeptide
of the invention.

In accordance with a further aspect of the invention, there are provided processes for utilizing
polypeptides or nucleic acid molecules, for in vitro purposes related to scientific research,
synthesis of DNA, and manufacture of vectors.

These and other aspects of the present invention will become evident upon reference to the
following detailed description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the DNA sequence of the C2GnT3 gene (accession # AF132035) and the predicted
amino acid sequence of C2GnT3. The amino acid sequence is shown in single letter code. The
hydrophobic segment representing the putative transmembrane domain is double underlined. Four
consensus motifs for N-glycosylation are indicated by asterisks. The locations of the primers used
for preparation of the expression constructs are indicated by single underlining.

Figure 2 is an illustration of a sequence comparison between human C2GnT3 (accession #
AF132035; SEQ ID NO: 2), human C2GnT2 (formerly designated C2/4GnT; accession #
AF038650; SEQ ID NO: 15), human C2GnT1 (formerly designated C2GnT-L; accession #
M97347; SEQ ID NO: 13), and human IGnT (accession # Z19550; SEQ ID NO: 17). Introduced
gaps are shown as hyphens, and aligned identical residues are boxshaded (black for all
sequences, dark grey for three sequences, and light grey for two sequences). The putative
transmembrane domains are boxed. The positions of conserved cysteines are indicated by
asterisks. One conserved N-glycosylation site is indicated by an open circle. The corresponding
nucleotide sequences are SEQ ID NO: 1 (C2GnT3), SEQ ID NO: 14 (C2GnT2), SEQ ID NO: 12
(C2GnT1), and SEQ ID NO: 16 (IGnT).

Figure 3 depicts Northern blot analyses of healthy human adult and fetal tissues. Panel A:
loading pattern for the human mRNA master blot (CLONTECH). Dots in row H contain 100 ng
(H1-H7) or 500 ng (H8) of control DNA or RNA. Panel B: autoradiogram of master blot
expression analysis using a 32P-labeled C2GnT3 probe corresponding to the soluble expression
fragment of C2GnT3 (base pairs 115-1359). Panel C: A multiple human tissue Northern blot
(MTN II from Clontech) was probed as described for panel B.

Figure 4 shows a PCR analysis of C2GnT3 expression in human blood cell fractions. PCR
amplifications with primers specific for human C2GnT3 (C2GnT3) or GAPDH (G3PDH) were
performed on a normalized human blood cell cDNA panel (MTC from Clontech) for 31 cycles.

Figure 5 is a schematic representation of forward and reverse PCR primers that can be used to
amplify the coding exon of the C2GnT3 gene. The sequences of the primers TSHC119 and
TSHC123 are also shown.

DETAILED DESCRIPTION OF THE INVENTION 

All patent applications, patents, and literature references cited in this specification are hereby 
incorporated by reference in their entirety. In the case of conflict, the present description, including 
definitions, is intended to control. 

DEFINITIONS 

1. "Nucleic acid" or "polynucleotide" as used herein refers to purine- and pyrimidine-containing 
polymers of any length, either polyribonucleotides or polydeoxyribonucleotides or mixed polyribo- 
polydeoxyribo nucleotides. This includes single- and double-stranded molecules, i.e., DNA-DNA, 
DNA-RNA and RNA-RNA hybrids, as well as "protein nucleic acids" (PNA) formed by 
conjugating bases to an amino acid backbone. This also includes nucleic acids containing modified 
bases (see below). 

2. "Complementary DNA or cDNA" as used herein refers to a DNA molecule or sequence that has 
been enzymatically synthesized from the sequences present in a mRNA template, or a clone of such 
a DNA molecule. A "DNA Construct" is a DNA molecule or a clone of such a molecule, either 
single- or double-stranded, which has been modified to contain segments of DNA that are combined 
and juxtaposed in a manner that would not otherwise exist in nature. By way of non-limiting 
example, a cDNA or DNA which has no introns, i.e., is free from non-coding sequences, is inserted 
adjacent to, or within, exogenous (e.g., heterologous) DNA sequences. 

3. A plasmid or, more generally, a vector or "expression vector", is a DNA construct containing 
genetic information that may provide for its replication when inserted into a host cell. A plasmid 
generally contains at least one gene sequence to be expressed in the host cell, as well as sequences 
that facilitate such gene expression, including promoters and transcription initiation sites. It may be 
a linear or closed circular molecule. Inserted coding sequences do not occur naturally in the 
organism from which the vector is derived. 



4. Nucleic acids are "hybridizable" to each other when at least one strand of one nucleic acid can 
anneal to another nucleic acid under defined stringency conditions. Stringency of hybridization is 
determined, e.g., by a) the temperature at which hybridization and/or washing is performed, and b) 
the ionic strength and polarity (e.g., formamide) of the hybridization and washing solutions, as well 
as other parameters. Hybridization requires that the two nucleic acids contain substantially 
complementary sequences; depending on the stringency of hybridization, however, mismatches may 
be tolerated. Typically, hybridization of two sequences at high stringency (such as, for example, in 
an aqueous solution of 0.5X SSC, at 65°C) requires that the sequences exhibit some high degree of 
complementarity over their entire sequence. Conditions of intermediate stringency (such as, for
example, an aqueous solution of 2X SSC at 65°C) and low stringency (such as, for example, an
aqueous solution of 2X SSC at 55°C), require correspondingly less overall complementarity
between the hybridizing sequences. (1X SSC is 0.15 M NaCl, 0.015 M Na citrate).
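
The example stringency conditions can be restated as salt concentrations using the 1X SSC definition given above. The following is a minimal sketch in Python; the dictionary and function names are hypothetical, and the three conditions are only the examples quoted in this definition, not an exhaustive set.

```python
# 1X SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}

# Example conditions quoted in the text:
# (SSC strength as a multiple of 1X, temperature in degrees Celsius).
CONDITIONS = {
    "high": (0.5, 65),
    "intermediate": (2.0, 65),
    "low": (2.0, 55),
}


def ssc_molarity(strength):
    """Scale the 1X SSC recipe to the given strength (e.g. 0.5 for 0.5X)."""
    return {solute: round(strength * molar, 6) for solute, molar in SSC_1X.items()}


for name, (strength, temp_c) in CONDITIONS.items():
    m = ssc_molarity(strength)
    print(f"{name:>12}: {strength}X SSC at {temp_c} C -> "
          f"NaCl {m['NaCl']} M, Na citrate {m['Na citrate']} M")
```

Lower salt at the same temperature (0.5X versus 2X SSC at 65°C) and higher temperature at the same salt (65°C versus 55°C at 2X SSC) both make the conditions more stringent, which is the ordering the definition relies on.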

5. An "isolated" nucleic acid or polypeptide as used herein refers to a component that is removed 
from its original environment (for example, its natural environment if it is naturally occurring). An 
isolated nucleic acid or polypeptide contains less than about 50%, preferably less than about 75%, 
and most preferably less than about 90%, of the cellular components with which it was originally 
associated. 

6. A "probe" refers to a nucleic acid that forms a hybrid structure with a sequence in a target region
due to complementarity of at least one sequence in the probe with a sequence in the target region.

7. A nucleic acid that is "derived from" a designated sequence refers to a nucleic acid sequence that 
corresponds to a region of the designated sequence. This encompasses sequences that are 
homologous or complementary to the sequence, as well as "sequence-conservative variants" and 
"function-conservative variants". Sequence-conservative variants are those in which a change of one 
or more nucleotides in a given codon position results in no alteration in the amino acid encoded at
that position. Function-conservative variants of C2GnT3 are those in which a given amino acid
residue in the polypeptide has been changed without altering the overall conformation and
enzymatic activity (including substrate specificity) of the native polypeptide; these changes include,
but are not limited to, replacement of an amino acid with one having similar physico-chemical
properties (such as, for example, acidic, basic, hydrophobic, and the like). 
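
The notion of a sequence-conservative variant can be checked mechanically by translating both sequences and comparing the encoded amino acids. The following is a minimal sketch in Python, assuming the standard genetic code; the function names are hypothetical. The 18-nucleotide reference fragment is the stretch ATGAAGATATTCAAATGT that appears within primer TSHC99 as quoted later in this document, used here purely for illustration, and the two variants are invented examples.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_sequence_conservative(reference, variant):
    """True if the variant differs in nucleotides but encodes the same amino acids."""
    return (
        len(reference) == len(variant)
        and reference.upper() != variant.upper()
        and translate(reference) == translate(variant)
    )


reference = "ATGAAGATATTCAAATGT"   # illustrative fragment (see lead-in)
silent = "ATGAAAATATTCAAATGT"      # hypothetical variant: AAG -> AAA, both encode Lys
missense = "ATGAATATATTCAAATGT"    # hypothetical variant: AAG -> AAT, Lys -> Asn

print(translate(reference))                            # MKIFKC
print(is_sequence_conservative(reference, silent))     # True
print(is_sequence_conservative(reference, missense))   # False
```

Whether a missense change is function-conservative cannot be decided from sequence alone, since the definition depends on conformation and enzymatic activity.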

8. A "donor substrate" is a molecule recognized by, e.g., a Core-β1,6-N-acetylglucosaminyltransferase
and that contributes an N-acetylglucosaminyl moiety for the transferase reaction. For C2GnT3, a
donor substrate is UDP-N-acetylglucosamine. An "acceptor substrate" is a molecule, preferably a
saccharide or oligosaccharide, that is recognized by, e.g., an N-acetylglucosaminyltransferase
and that is the target for the modification catalyzed by the transferase, i.e., receives the
N-acetylglucosaminyl moiety. For C2GnT3, acceptor substrates include without limitation
oligosaccharides, glycoproteins, O-linked core 1-glycopeptides, and glycosphingolipids
comprising the sequences Galβ1-3GalNAc, or GlcNAcβ1-3GalNAc.

9. In accordance with the present invention there may be employed conventional molecular
biology, microbiology, and recombinant DNA techniques within the skill of the art. Such
techniques are explained fully in the literature. See, for example, Sambrook, Fritsch, Maniatis,
Molecular Cloning: A Laboratory Manual, Second Edition (1989) (Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y.); DNA Cloning: A Practical Approach, Volumes I
and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid
Hybridization (B.D. Hames & S.J. Higgins eds. 1985); Transcription and Translation (B.D.
Hames & S.J. Higgins eds. 1984); Animal Cell Culture (R.I. Freshney ed. 1986); Immobilized
Cells and Enzymes (IRL Press, 1986); and B. Perbal, A Practical Guide to Molecular Cloning
(1984).

10. The terms "sequence similarity" or "sequence identity" refer to the relationship between two
or more amino acid or nucleic acid sequences, determined by comparing the sequences, which
relationship is generally known as "homology". Identity in the art also means the degree of
sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as
determined by the match between strings of such sequences. Both identity and similarity can be
readily calculated (Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press,
New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W. ed., Academic
Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin,
H.G. eds, Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von
Heijne, G., Academic Press, New York, 1987; and Sequence Analysis Primer, Gribskov, M. and
Devereux, J., eds. M. Stockton Press, New York, 1991). While there are a number of existing
methods to measure identity and similarity between two amino acid sequences or two nucleic
acid sequences, both terms are well known to the skilled artisan (Sequence Analysis in
Molecular Biology, von Heijne, G., Academic Press, New York, 1987; Sequence Analysis
Primer, Gribskov, M. and Devereux, J., eds. M. Stockton Press, New York, 1991; and Carillo, H.,
and Lipman, D. SIAM J. Applied Math., 48: 1073, 1988). Preferred methods for determining
identity are designed to give the largest match between the sequences tested. Methods to
determine identity are codified in computer programs. Preferred computer program methods for
determining identity and similarity between two sequences include but are not limited to the
GCG program package (20), BLASTP, BLASTN, and FASTA (21). Identity or similarity may
also be determined using the alignment algorithm of Dayhoff et al. (Methods in Enzymology 91:
524-545 (1983)).

Preferably the nucleic acids of the present invention have substantial sequence identity using the 
preferred computer programs cited herein, for example greater than 40%, 45%, 50%, 60%, 70%, 
75%, 80%, 85%, or 90% identity; more preferably at least 95%, 96%, 97%, 98%, or 99% 
sequence identity to the sequence shown in SEQ ID NO: 1 and Figure 1.
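
To make the percentage thresholds concrete, percent identity over an existing pairwise alignment can be computed as follows. This is a minimal sketch in Python and not the scoring used by the GCG package, BLAST, or FASTA, each of which has its own alignment procedure and reporting conventions. It assumes the two sequences are already aligned to equal length with hyphens for gaps (the gap convention used in Figure 2) and that identity is counted over all aligned columns, gaps included; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumptions: inputs are pre-aligned and of equal length; columns that are
    gaps in both sequences are ignored; a gap opposite a residue is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy alignment: 9 columns, 7 identical, one gap, one substitution.
identity = percent_identity("ACGT-ACGT", "ACGTTACCT")
print(f"{identity:.1f}% identity")
print("at least 70% identical:", identity >= 70)
```

Other denominators (for example, the length of the shorter sequence) are also in common use, so a reported percentage is only meaningful together with the program and settings that produced it.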

11. The polypeptides of the invention also include homologs of a C2GnT3 polypeptide and/or
truncations thereof as described herein. Such homologs include polypeptides whose amino acid
sequences are comprised of the amino acid sequences of C2GnT3 polypeptide regions from other
species that hybridize under selected hybridization conditions (see discussion of hybridization
conditions, in particular stringent hybridization conditions, herein) with a probe used to obtain a
C2GnT3 polypeptide or to SEQ ID NO: 1. These homologs will generally have the same regions
which are characteristic of a C2GnT3 polypeptide. It is anticipated that a polypeptide comprising
an amino acid sequence which has at least 40% identity, at least 45%, or at least 60% similarity,
preferably at least 60-65% identity or at least 80-85% similarity, more preferably at least 70-80%
identity or at least 90-95% similarity, most preferably at least 95% identity or at least 99%
similarity with the amino acid sequence shown in SEQ ID NO: 2 and Figures 1 and 2, will be a
homolog of a C2GnT3 polypeptide. A percent amino acid sequence similarity or identity is
calculated using the methods described herein, preferably the computer programs described
herein.

IDENTIFICATION AND CLONING OF C2GnT3 

The present invention provides the isolated DNA molecules, including genomic DNA and cDNA,
encoding the UDP-N-acetylglucosamine: N-acetylgalactosamine β1,6 N-acetylglucosaminyltransferase 3
(C2GnT3).

C2GnT3 was identified by analysis of genomic survey sequences (GSS), and cloned based on a
genomic clone obtained from a human foreskin fibroblast library. The cloning strategy may be
briefly summarized as follows: 1) isolation and sequencing of GSS clone CIT-HSP-2288B17.TF
(GSS GenBank accession number AQ005888); 2) synthesis of oligonucleotides derived from GSS
sequence information, designated TSHC96 and TSHC101; 3) identification, cloning and sequencing
of genomic P1 clone GS22597 #844/B1; 4) identification of a novel cDNA sequence corresponding
to C2GnT3; 5) confirmatory sequencing of a cDNA clone obtained by reverse-transcription
polymerase chain reaction (RT-PCR) using human thymus poly A-mRNA; 6) construction of
expression constructs; 7) expression of the cDNA encoding C2GnT3 in Sf9 (Spodoptera
frugiperda) cells. More specifically, the isolation of a representative DNA molecule encoding a
novel third member of the mammalian UDP-N-acetylglucosamine: N-acetylgalactosamine
β1,6-N-acetylglucosaminyltransferase family involved the procedures described below.



Identification of DNA Homologous to C2/4GnT (C2GnT2) 



Database searches were performed with the coding sequence of the human C2/4GnT (C2GnT2)
sequence (13) using the BLASTn and the tBLASTn algorithms with the GSS database at The
National Center for Biotechnology Information, USA. The BLASTn algorithm was used to identify
clones representing the query gene (identities of ≥ 95%), whereas tBLASTn was used to identify
non-identical, but similar GSS sequences. GSSs with 50-90% nucleotide sequence identity were
regarded as different from the query sequence. Two GSS clones with several apparent short
sequence motifs and cysteine residues arranged with similar spacing were selected for further
sequence analysis.
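
The identity thresholds described above can be sketched as a small classifier. This is only an illustration: the function name, the labels, and the example identities are hypothetical, and the text does not say how hits outside the two stated ranges were handled, so they are returned as unclassified.

```python
def classify_gss_hit(percent_identity: float) -> str:
    """Label a BLASTn hit by its nucleotide identity to the query sequence.

    Thresholds follow the text: >= 95% is treated as the query gene itself,
    50-90% as a different (but similar) sequence. Everything else is left
    unclassified because the text does not assign it.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 95.0:
        return "query gene"
    if 50.0 <= percent_identity <= 90.0:
        return "distinct but similar"
    return "unclassified"


# Made-up identities, for illustration only.
hits = {"hit_A": 98.2, "hit_B": 71.5, "hit_C": 92.0}
labels = {name: classify_gss_hit(pid) for name, pid in hits.items()}
print(labels)
```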

Cloning of Human C2GnT3 

GSS clone CIT-HSP-2288B17.TF (GSS GenBank accession number AQ005888), derived from a
putative homologue to C2/4GnT (C2GnT2), was obtained from Research Genetics Inc., USA.
Sequencing of this clone revealed a partial open reading frame with significant sequence
similarity to C2/4GnT (C2GnT2). The coding region of human C2GnT-L (C2GnT1), C2/4GnT
(C2GnT2) and a bovine homologue was previously found to be organized in one exon
((22),(15)). Since the 3' sequence available from the C2GnT3 GSS was incomplete but likely to
be located in a single exon, the missing 3' portion of the open reading frame was obtained by
sequencing a genomic P1 clone. The P1 clone was obtained from a human foreskin genomic P1
library (DuPont Merck Pharmaceutical Co. Human Foreskin Fibroblast P1 Library) by screening
with the primer pair:

TSHC96 (5'-GGTTTCACCGTCTCCAACATA-3', SEQ ID NO: 3) and
TSHC101 (5'-TCGTAAGGCACCTGATACTT-3', SEQ ID NO: 6).

One genomic clone for C2GnT3, GS22597 #844/B1, was obtained from Genome Systems Inc.
DNA from P1 phage was prepared as recommended by Genome Systems Inc. The entire coding
sequence of the C2GnT3 gene was represented in the clone and sequenced in full using
automated sequencing (ABB??, Perkin-Elmer). Confirmatory sequencing was performed on a
cDNA clone obtained by PCR (30 cycles at 95 °C for 10 sec; 55 °C for 15 sec and 68 °C for 2
min 30 sec) on cDNA from human thymus poly A-mRNA with the sense primer:

TSHC99 (5'-CGAGGATCCAGAATGAAGATATTCAAATGTTA-3', SEQ ID NO: 4),

and the anti-sense primer 

TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9). 
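
As a rough planning aid, the cycling programme stated above (30 cycles of 95 °C for 10 sec, 55 °C for 15 sec and 68 °C for 2 min 30 sec) can be written down as data and its total hold time computed. This is a sketch only: the class and function names are hypothetical, and ramp times and any initial denaturation or final extension steps are not given in the text and are therefore not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


# One cycle as stated in the text; 2 min 30 sec = 150 s.
CYCLE = (Step(95, 10), Step(55, 15), Step(68, 150))
N_CYCLES = 30


def hold_time_seconds(cycle, n_cycles):
    """Total time spent at the programmed temperatures (ramping excluded)."""
    return n_cycles * sum(step.seconds for step in cycle)


total_s = hold_time_seconds(CYCLE, N_CYCLES)
print(f"{total_s} s = {total_s / 60:.1f} min")  # prints "5250 s = 87.5 min"
```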

The composite sequence contained an open reading frame of 1359 base pairs encoding a putative
protein of 453 amino acids with type II domain structure predicted by the TMpred algorithm at
the Swiss Institute for Experimental Cancer Research (ISREC)
(http://www.ch.embnet.org/software/TMPRED_form.html).

Expression of C2GnT3

An expression construct designed to encode amino acid residues 39-453 of C2GnT3 was
prepared by PCR using P1 DNA, and the primer pair:

TSHC100 (5'-CGAGGATCCGCAAAAAGACATTTACTTGGTT-3', SEQ ID NO: 5) and
TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9)

with BamHI and EcoRI restriction sites, respectively (Fig. 2). The PCR product was cloned
between the BamHI and EcoRI sites of pAcGP67A (PharMingen), and the insert was fully
sequenced. pAcGP67-C2GnT3-sol was co-transfected with Baculo-Gold™ DNA (PharMingen)
as described previously (23). Recombinant Baculovirus was obtained after two successive
amplifications in Sf9 cells grown in serum-containing medium, and titers of virus were estimated
by titration in 24-well plates with monitoring of enzyme activities. Transfection of Sf9 cells with
pAcGP67-C2GnT3-sol resulted in a marked increase in GlcNAc-transferase activity compared to
uninfected cells or cells infected with a control construct. C2GnT3 showed significant activity
with disaccharide derivatives of O-linked core 1 (Galβ1-3GalNAcα1-R). In contrast, no activity
was found with core 3 structures (GlcNAcβ1-3GalNAcα1-R), lacto-N-neotetraose or
GlcNAcβ1-3Gal-Me as acceptor substrates, indicating that C2GnT3 has no Core4GnT or IGnT
activity. Additionally, no activity could be detected with α-D-GalNAc-1-para-nitrophenyl,
indicating that C2GnT3 does not form core 6 (GlcNAcβ1-6GalNAcα1-R) (Table I). No substrate
inhibition of enzyme activity was found at high acceptor concentrations up to 20 mM core 1-
para-nitrophenyl. C2GnT3 shows strict donor substrate specificity for UDP-GlcNAc; no activity
could be detected with UDP-Gal or UDP-GalNAc (data not shown).

Table I: Substrate specificities of C2GnT3 and C2GnT1

                                          C2GnT3 (a)           C2GnT1
Substrate                               2 mM      10 mM     2 mM      10 mM
                                          nmol/h/mg            nmol/h/mg
β-D-Gal-(1-3)-α-D-GalNAc                6.6       14.3      9.6       19.0
β-D-Gal-(1-3)-α-D-GalNAc-1-p-Nph        18.1      26.1      16.2      23.6
β-D-GlcNAc-(1-3)-α-D-GalNAc-1-p-Nph     <0.1      <0.1      <0.1      <0.1
α-D-GalNAc-1-p-Nph                      <0.1      <0.1      <0.1      <0.1
D-GalNAc                                <0.1      <0.1      <0.1      <0.1
lacto-N-neo-tetraose                    <0.1      <0.1      <0.1      <0.1
β-D-GlcNAc-(1-3)-β-D-Gal-1-Me           <0.1      <0.1      <0.1      <0.1

(a) Enzyme sources were partially purified media of infected High Five™ cells (see "Experimental
Procedures"). Background values obtained with uninfected cells or cells infected with an
irrelevant construct were subtracted.
Abbreviations: Me, methyl; Nph, nitrophenyl.

Controls included the pAcGP67-GalNAc-T3-sol (24). The kinetic properties were determined
with partially purified enzymes expressed in High Five™ cells. Partial purification was
performed by consecutive chromatography on Amberlite IRA-95, DEAE-Sephacryl and
SP-Sepharose essentially as described (25; 25).

Northern Blot analysis of Human Organs 

A human RNA master blot containing mRNA from fifty healthy human adult and fetal organs
(CLONTECH) and a human multiple tissue northern blot (MTNII from CLONTECH) were probed
with a 32P-labeled probe corresponding to the soluble fragment of C2GnT3 (base pairs 115-1359).
The autoradiographic analyses showed expression of C2GnT3 predominantly in lymphoid organs
and in organs of the gastrointestinal tract, with high transcription levels observed in thymus and
lower levels in PBLs, lymph node, stomach, pancreas and small intestine (Fig. 3A and 3B). The size
of the single transcript was approximately 5.5 kilobases, which is close to the 5.4-kilobase size of
the largest of the three transcripts of human C2GnT1 (Fig. 3C). Multiple transcripts of
C2GnT1 have been suggested to be caused by differential usage of polyadenylation signals, which
affects the length of the 3' UTR (13).
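
The coordinates quoted in this section are mutually consistent, which can be checked with simple codon arithmetic: a 1359 base pair open reading frame corresponds to 453 codons, and residue 39, the first residue of the soluble construct, begins at base 115, the first base of the probe. The helper names below are hypothetical and the numbering is assumed to be 1-based from the first base of the start codon.

```python
ORF_BP = 1359  # open reading frame length stated in the text


def codons_in(orf_bp: int) -> int:
    """Number of codons in an open reading frame of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3


def residue_to_bases(residue: int) -> tuple[int, int]:
    """First and last base (1-based, inclusive) of the codon for a residue."""
    if residue < 1:
        raise ValueError("residue numbering starts at 1")
    first = 3 * (residue - 1) + 1
    return first, first + 2


n_residues = codons_in(ORF_BP)
soluble_span = (residue_to_bases(39)[0], residue_to_bases(n_residues)[1])
print(n_residues, soluble_span)  # prints "453 (115, 1359)"
```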

The C2GnT3 enzyme of the present invention was shown to exhibit O-glycosylation capacity,
implying that the C2GnT3 gene is vital for correct/full O-glycosylation in vivo as well. A structural
defect in the C2GnT3 gene leading to a deficient enzyme or completely defective enzyme would
therefore expose a cell or an organism to protein/peptide sequences which were not covered by
O-glycosylation as seen in cells or organisms with an intact C2GnT3 gene. Described in Example 5
below is a method for scanning the coding exon for potential structural defects. Similar methods
could be used for the characterization of defects in the non-coding region of the C2GnT3 gene,
including the promoter region.

DNA, Vectors, and Host Cells

In practicing the present invention, many conventional techniques in molecular biology,
microbiology, recombinant DNA, and immunology are used. Such techniques are well known and
are explained fully in, for example, Sambrook et al., 1989, Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York;
DNA Cloning: A Practical Approach, Volumes I and II, 1985 (D.N. Glover ed.); Oligonucleotide
Synthesis, 1984 (M.L. Gait ed.); Nucleic Acid Hybridization, 1985 (Hames and Higgins);
Transcription and Translation, 1984 (Hames and Higgins eds.); Animal Cell Culture, 1986 (R.I.
Freshney ed.); Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A Practical Guide
to Molecular Cloning; the series, Methods in Enzymology (Academic Press, Inc.); Gene Transfer
Vectors for Mammalian Cells, 1987 (J. H. Miller and M. P. Calos eds., Cold Spring Harbor
Laboratory); Methods in Enzymology Vol. 154 and Vol. 155 (Wu and Grossman, and Wu, eds.,
respectively); Immunochemical Methods in Cell and Molecular Biology, 1987 (Mayer and Walker,
eds.; Academic Press, London); Scopes, 1987, Protein Purification: Principles and Practice, Second
Edition (Springer-Verlag, N.Y.) and Handbook of Experimental Immunology, 1986, Volumes I-IV
(Weir and Blackwell eds.).

The invention encompasses isolated nucleic acid fragments comprising all or part of the nucleic acid
sequence disclosed herein as set forth in Figure 1. The fragments are at least about 8 nucleotides in
length, preferably at least about 12 nucleotides in length, and most preferably at least about 15-20
nucleotides in length. The invention further encompasses isolated nucleic acids comprising
sequences that are hybridizable under stringency conditions of 2X SSC, 55 °C, to the sequence set
forth in Figure 1; preferably, the nucleic acids are hybridizable at 2X SSC, 65 °C; and most
preferably, are hybridizable at 0.5X SSC, 65 °C.
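
The three hybridization conditions above increase in stringency because the temperature rises and the salt concentration falls. A minimal sketch of the salt side of that comparison follows; it assumes the standard laboratory recipe for 1X SSC (150 mM NaCl, 15 mM trisodium citrate), which is not stated in this document, and the function name and condition labels are illustrative only.

```python
# Assumed standard recipe for 1X SSC (not stated in the text).
NACL_MM_1X = 150.0     # mM sodium chloride
CITRATE_MM_1X = 15.0   # mM trisodium citrate (three Na+ per citrate)


def sodium_mm(ssc_fold: float) -> float:
    """Approximate total Na+ concentration (mM) of an SSC dilution."""
    if ssc_fold <= 0:
        raise ValueError("SSC concentration must be positive")
    return ssc_fold * (NACL_MM_1X + 3 * CITRATE_MM_1X)


# (label, SSC fold, temperature in degrees C) as listed in the text.
CONDITIONS = [
    ("baseline", 2.0, 55),
    ("preferred", 2.0, 65),
    ("most preferred", 0.5, 65),
]

for label, fold, temp_c in CONDITIONS:
    print(f"{label}: {fold}X SSC, {temp_c} C, about {sodium_mm(fold):.1f} mM Na+")
```

Lower sodium and higher temperature both destabilize mismatched duplexes, which is why the last condition is the most selective of the three.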

The nucleic acids may be isolated directly from cells. Alternatively, the polymerase chain reaction
(PCR) method can be used to produce the nucleic acids of the invention, using either chemically
synthesized strands or genomic material as templates. Primers used for PCR can be synthesized
using the sequence information provided herein and can further be designed to introduce appropriate
new restriction sites, if desirable, to facilitate incorporation into a given vector for recombinant
expression.

The nucleic acids of the present invention may be flanked by natural human regulatory sequences,
or may be associated with heterologous sequences, including transcriptional control elements such
as promoters, enhancers, and response elements, or other sequences such as signal sequences,
polyadenylation sequences, introns, 5'- and 3'-noncoding regions, and the like. Preferably, although
not necessarily, any two nucleotide sequences to be expressed as a fusion polypeptide are inserted
in-frame. The nucleic acids may also be modified by many means known in the art. Non-limiting
examples of such modifications include methylation, "caps", substitution of one or more of the
naturally occurring nucleotides with an analog, internucleotide modifications such as, for example,
those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoroamidates,
carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.).
Nucleic acids may contain one or more additional covalently linked moieties, such as, for example,
proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g.,
acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, etc.), and
alkylators. The nucleic acid may be derivatized by formation of a methyl or ethyl phosphotriester or
an alkyl phosphoramidate linkage. Furthermore, the nucleic acid sequences of the present invention
may also be modified with a label capable of providing a detectable signal, either directly or
indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.

According to the present invention, useful probes comprise a probe sequence at least eight
nucleotides in length that consists of all or part of the sequence from among the sequences as set
forth in Figure 1 or sequence-conservative or function-conservative variants thereof, or a
complement thereof, and that has been labelled as described above.

The invention also provides nucleic acid vectors comprising the disclosed sequence or derivatives or
fragments thereof. A large number of vectors, including plasmid and fungal vectors, have been
described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts, and may
be used for gene therapy as well as for simple cloning or protein expression.

Recombinant cloning vectors will often include one or more replication systems for cloning or
expression, one or more markers for selection in the host, e.g. antibiotic resistance, and one or more
expression cassettes. The inserted coding sequences may be synthesized by standard methods,
isolated from natural sources, or prepared as hybrids, etc. Ligation of the coding sequences to
transcriptional regulatory elements and/or to other amino acid coding sequences may be achieved
by known methods. Suitable host cells may be transformed/transfected/infected as appropriate by
any suitable method including electroporation, CaCl2 mediated DNA uptake, fungal infection,
microinjection, microprojectile, or other established methods.

Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast, and plant and
animal cells, especially mammalian cells. Also included are avian and insect cells. Of particular
interest are Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Hansenula
polymorpha, Neurospora spec., SF9 cells, C129 cells, 293 cells, and CHO cells, COS cells, HeLa
cells, and immortalized mammalian myeloid and lymphoid cell lines. Preferred replication systems
include M13, ColE1, 2µ, ARS, SV40, baculovirus, lambda, adenovirus, and the like. A large
number of transcription initiation and termination regulatory regions have been isolated and shown
to be effective in the transcription and translation of heterologous proteins in the various hosts.
Examples of these regions, methods of isolation, manner of manipulation, etc. are known in the art.
Under appropriate expression conditions, host cells can be used as a source of recombinantly
produced C2GnT3 derived peptides and polypeptides.

Advantageously, vectors may also include a transcription regulatory element (i.e., a promoter)
operably linked to the C2GnT3 coding portion. The promoter may optionally contain operator
portions and/or ribosome binding sites. Non-limiting examples of bacterial promoters compatible
with E. coli include: β-lactamase (penicillinase) promoter; lactose promoter; tryptophan (trp)
promoter; arabinose BAD operon promoter; lambda-derived PL promoter and N gene ribosome
binding site; and the hybrid tac promoter derived from sequences of the trp and lac UV5 promoters.
Non-limiting examples of yeast promoters include 3-phosphoglycerate kinase promoter,
glyceraldehyde-3 phosphate dehydrogenase (GAPDH) promoter, galactokinase (GAL1) promoter,
galactoepimerase (GAL10) promoter, metallothioneine (CUP) promoter and alcohol dehydrogenase
(ADH) promoter. Suitable promoters for mammalian cells include without limitation viral
promoters such as that from Simian Virus 40 (SV40), Rous sarcoma virus (RSV), adenovirus
(ADV), and bovine papilloma virus (BPV). Mammalian cells may also require terminator sequences
and poly A addition sequences; enhancer sequences which increase expression may also be
included, and sequences which cause amplification of the gene may also be desirable. Furthermore,
sequences that facilitate secretion of the recombinant product from cells, including, but not limited
to, bacteria, yeast, and animal cells, such as secretory signal sequences and/or prohormone pro
region sequences, may also be included. These sequences are known in the art.

Nucleic acids encoding wild type or variant polypeptides may also be introduced into cells by 
recombination events. For example, such a sequence can be introduced into a cell, and thereby 
effect homologous recombination at the site of an endogenous gene or a sequence with substantial 
identity to the gene. Other recombination-based methods such as nonhomologous recombinations or 
deletion of endogenous genes by homologous recombination may also be used. 

The nucleic acids of the present invention find use, for example, as probes for the detection of 
C2GnT3 in other species or related organisms and as templates for the recombinant production of 
peptides or polypeptides. These and other embodiments of the present invention are described in 
more detail below. 



Polypeptides and Antibodies 



The present invention encompasses isolated peptides and polypeptides encoded by the disclosed 
cDNA sequence. Peptides are preferably at least five residues in length. 

Nucleic acids comprising protein-coding sequences can be used to direct the recombinant
expression of polypeptides in intact cells or in cell-free translation systems. The known genetic
code, tailored if desired for more efficient expression in a given host organism, can be used to
synthesize oligonucleotides encoding the desired amino acid sequences. The phosphoramidite solid
support method of (26), the method of (27), or other well known methods can be used for such
synthesis. The resulting oligonucleotides can be inserted into an appropriate vector and expressed in
a compatible host organism.


The polypeptides of the present invention, including function-conservative variants of the disclosed
sequence, may be isolated from native or from heterologous organisms or cells (including, but not
limited to, bacteria, fungi, insect, plant, and mammalian cells) into which a protein-coding sequence
has been introduced and expressed. Furthermore, the polypeptides may be part of recombinant
fusion proteins.

Methods for polypeptide purification are well known in the art, including, without limitation,
preparative discontinuous gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel
filtration, ion exchange and partition chromatography, and countercurrent distribution. For some
purposes, it is preferable to produce the polypeptide in a recombinant system in which the protein
contains an additional sequence tag that facilitates purification, such as, but not limited to, an
affinity ligand, reactive group, and/or a polyhistidine sequence. The polypeptide can then be
purified from a crude lysate of the host cell by chromatography on an appropriate solid-phase
matrix. Alternatively, antibodies produced against a protein or against peptides derived therefrom
can be used as purification reagents. Other purification methods are possible.

The present invention also encompasses derivatives and homologues of polypeptides. For some
purposes, nucleic acid sequences encoding the peptides may be altered by substitutions, additions,
or deletions that provide for functionally equivalent molecules, i.e., function-conservative variants.
For example, one or more amino acid residues within the sequence can be substituted by another
amino acid of similar properties, such as, for example, positively charged amino acids (arginine,
lysine, and histidine); negatively charged amino acids (aspartate and glutamate); polar neutral amino
acids; and non-polar amino acids.
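
The four property groups named above can be sketched as a lookup that flags whether a substitution stays within one group. The text lists members only for the two charged groups; the membership of the polar neutral and non-polar groups below is a common textbook assignment and is an assumption here, as are the function names. A same-group substitution is only a candidate for a function-conservative variant, since retained activity still has to be shown experimentally.

```python
# One-letter amino acid codes grouped by side-chain property.
GROUPS = {
    "positive": set("RKH"),            # arginine, lysine, histidine (from the text)
    "negative": set("DE"),             # aspartate, glutamate (from the text)
    "polar_neutral": set("STNQCY"),    # assumed membership
    "non_polar": set("AVLIMFWPG"),     # assumed membership
}


def group_of(residue: str) -> str:
    """Return the property group of a one-letter amino acid code."""
    code = residue.upper()
    for name, members in GROUPS.items():
        if code in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def same_property_group(original: str, replacement: str) -> bool:
    """True if both residues fall in the same property group."""
    return group_of(original) == group_of(replacement)


print(same_property_group("K", "R"))  # lysine -> arginine, both positively charged
print(same_property_group("D", "K"))  # aspartate -> lysine, charge reversal
```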

The isolated polypeptides may be modified by, for example, phosphorylation, sulfation, acylation,
or other protein modifications. They may also be modified with a label capable of providing a 
detectable signal, either directly or indirectly, including, but not limited to, radioisotopes and 
fluorescent compounds. 

The present invention encompasses antibodies that specifically recognize immunogenic components 
derived from C2GnT3. Such antibodies can be used as reagents for detection and purification of 
C2GnT3. 

C2GnT3 specific antibodies according to the present invention include polyclonal and monoclonal
antibodies. The antibodies may be elicited in an animal host by immunization with C2GnT3
components or may be formed by in vitro immunization of immune cells. The immunogenic
components used to elicit the antibodies may be isolated from human cells or produced in
recombinant systems. The antibodies may also be produced in recombinant systems programmed
with appropriate antibody-encoding DNA. Alternatively, the antibodies may be constructed by
biochemical reconstitution of purified heavy and light chains. The antibodies include hybrid
antibodies (i.e., containing two sets of heavy chain/light chain combinations, each of which
recognizes a different antigen), chimeric antibodies (i.e., in which either the heavy chains, light
chains, or both, are fusion proteins), and univalent antibodies (i.e., comprised of a heavy chain/light
chain complex bound to the constant region of a second heavy chain). Also included are Fab
fragments, including Fab' and F(ab)2 fragments of antibodies. Methods for the production of all of
the above types of antibodies and derivatives are well known in the art. For example, techniques for
producing and processing polyclonal antisera are disclosed in Mayer and Walker, 1987,
Immunochemical Methods in Cell and Molecular Biology (Academic Press, London).

The antibodies of this invention can be purified by standard methods, including but not limited to
preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration,
ion exchange and partition chromatography, and countercurrent distribution. Purification methods
for antibodies are disclosed, e.g., in The Art of Antibody Purification, 1989, Amicon Division, W.R.
Grace & Co. General protein purification methods are described in Protein Purification: Principles
and Practice, R.K. Scopes, Ed., 1987, Springer-Verlag, New York, NY.

Anti-C2GnT3 antibodies, whether unlabeled or labeled by standard methods, can be used as the
basis for immunoassays. The particular label used will depend upon the type of immunoassay used.
Examples of labels that can be used include, but are not limited to, radiolabels such as 32P, 125I, 3H
and 14C; fluorescent labels such as fluorescein and its derivatives, rhodamine and its derivatives,
dansyl and umbelliferone; chemiluminescers such as luciferin and 2,3-dihydrophthalazinediones;
and enzymes such as horseradish peroxidase, alkaline phosphatase, lysozyme and glucose-6-
phosphate dehydrogenase.

The antibodies can be tagged with such labels by known methods. For example, coupling agents
such as aldehydes, carbodiimides, dimaleimide, imidates, succinimides, bis-diazotized benzidine and
the like may be used to tag the antibodies with fluorescent, chemiluminescent or enzyme labels. The
general methods involved are well known in the art and are described in, e.g., Chan (Ed.), 1987,
Immunoassay: A Practical Guide, Academic Press, Inc., Orlando, FL.

APPLICATIONS OF THE NUCLEIC ACID MOLECULES, POLYPEPTIDES, AND 
ANTIBODIES OF THE INVENTION 

The nucleic acid molecules, C2GnT3 polypeptide, and antibodies of the invention may be used
in the prognostic and diagnostic evaluation of conditions associated with altered expression or
activity of a polypeptide of the invention or conditions requiring modulation of a nucleic acid or
polypeptide of the invention, including thymus-related disorders and proliferative disorders (e.g.
cancer), and the identification of subjects with a predisposition to such conditions (see below).
Methods for detecting nucleic acid molecules and polypeptides of the invention can be used to
monitor such conditions by detecting and localizing the polypeptides and nucleic acids. It would
also be apparent to one skilled in the art that the methods described herein may be used to study
the developmental expression of the polypeptides of the invention and, accordingly, will provide
further insight into the role of the polypeptides. The applications of the present invention also
include methods for the identification of substances or compounds that modulate the biological
activity of a polypeptide of the invention (see below). The substances, compounds, antibodies
etc., may be used for the treatment of conditions requiring modulation of polypeptides of the
invention (see below).

Diagnostic Methods



A variety of methods can be employed for the diagnostic and prognostic evaluation of conditions
requiring modulation of a nucleic acid or polypeptide of the invention (e.g. thymus-related
disorders, and cancer), and the identification of subjects with a predisposition to such conditions.
Such methods may, for example, utilize nucleic acids of the invention, and fragments thereof,
and antibodies directed against polypeptides of the invention, including peptide fragments. In
particular, the nucleic acids and antibodies may be used, for example, for: (1) the detection of the
presence of C2GnT3 mutations, or the detection of either over- or under-expression of C2GnT3
mRNA relative to a non-disorder state or the qualitative or quantitative detection of alternatively
spliced forms of C2GnT3 transcripts which may correlate with certain conditions or
susceptibility toward such conditions; or (2) the detection of either an over- or an under-
abundance of a polypeptide of the invention relative to a non-disorder state or the presence of a
modified (e.g., less than full length) polypeptide of the invention which correlates with a disorder
state, or a progression toward a disorder state.

The methods described herein may be performed by utilizing pre-packaged diagnostic kits
comprising at least one specific nucleic acid or antibody described herein, which may be
conveniently used, e.g., in clinical settings, to screen and diagnose patients and to screen and
identify those individuals exhibiting a predisposition to developing a disorder.

Nucleic acid-based detection techniques and peptide detection techniques are described below.
The samples that may be analyzed using the methods of the invention include those that are
known or suspected to express C2GnT3 nucleic acids or contain a polypeptide of the invention.
The methods may be performed on biological samples including but not limited to cells, lysates
of cells which have been incubated in cell culture, chromosomes isolated from a cell (e.g. a
spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such
as for Southern analysis), RNA (in solution or bound to a solid support such as for northern
analysis), cDNA (in solution or bound to a solid support), an extract from cells or a tissue, and
biological fluids such as serum, urine, blood, and CSF. The samples may be derived from a
patient or a culture.

Methods for Detection of Nucleic Acid Molecules of the Invention 



The nucleic acid molecules of the invention allow those skilled in the art to construct nucleotide
probes for use in the detection of nucleic acid sequences of the invention in biological materials.
Suitable probes include nucleic acid molecules based on nucleic acid sequences encoding at least
5 sequential amino acids from regions of the C2GnT3 polypeptide (see SEQ ID NO: 1);
preferably they comprise 15 to 50 nucleotides, more preferably 15 to 40 nucleotides, most
preferably 15-30 nucleotides. A nucleotide probe may be labelled with a detectable substance
such as a radioactive label that provides for an adequate signal and has sufficient half-life such as
32P, 3H, 14C or the like. Other detectable substances that may be used include antigens that are
recognized by a specific labelled antibody, fluorescent compounds, enzymes, antibodies specific
for a labelled antigen, and luminescent compounds. An appropriate label may be selected having
regard to the rate of hybridization and binding of the probe to the nucleotide to be detected and
the amount of nucleotide available for hybridization. Labelled probes may be hybridized to
nucleic acids on solid supports such as nitrocellulose filters or nylon membranes as generally
described in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual (2nd ed.). The
nucleic acid probes may be used to detect C2GnT3 genes, preferably in human cells. The
nucleotide probes may also be used for example in the diagnosis or prognosis of conditions such
as thymus-related disorders and cancer, and in monitoring the progression of these conditions, or
monitoring a therapeutic treatment.

The probe may be used in hybridisation techniques to detect a C2GnT3 gene. The technique
generally involves contacting and incubating nucleic acids (e.g. recombinant DNA molecules,
cloned genes) obtained from a sample from a patient or other cellular source with a probe of the
present invention under conditions favourable for the specific annealing of the probes to
complementary sequences in the nucleic acids. After incubation, the non-annealed nucleic acids
are removed, and the presence of nucleic acids that have hybridized to the probe, if any, is
detected.

The detection of nucleic acid molecules of the invention may involve the amplification of
specific gene sequences using an amplification method (e.g. PCR), followed by the analysis of
the amplified molecules using techniques known to those skilled in the art. Suitable primers can
be routinely designed by one of skill in the art. For example, primers may be designed using
commercially available software, such as OLIGO 4.06 Primer Analysis software (National
Biosciences, Plymouth, Minn.) or another appropriate program, to be about 22 to 30 nucleotides
in length, to have a GC content of about 50% or more, and to anneal to the template at
temperatures of about 60 °C to 72 °C.
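
Two of the design criteria above, length and GC content, are easy to screen for before turning to dedicated primer software. The sketch below is a minimal illustration and not a substitute for such software: the function names and the example sequences are hypothetical, the cut-offs treat "about" as exact bounds, and the annealing temperature criterion is not evaluated.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (A, C, G, T only)."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_length_and_gc(seq: str, min_len: int = 22, max_len: int = 30,
                        min_gc: float = 0.50) -> bool:
    """Check the length window and minimum GC content from the text."""
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Made-up candidate sequences, for illustration only.
candidates = {
    "balanced_24mer": "GCAT" * 6,   # 24 nt, half G/C
    "at_rich_24mer": "AT" * 12,     # 24 nt, no G/C
    "short_6mer": "GCGCGC",         # too short
}
for name, seq in candidates.items():
    print(name, len(seq), round(gc_fraction(seq), 2), meets_length_and_gc(seq))
```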

Genomic DNA may be used in hybridization or amplification assays of biological samples to 
detect abnormalities involving C2GnT3 nucleic acid structure, including point mutations, 
insertions, deletions, and chromosomal rearrangements. For example, direct sequencing, single 
stranded conformational polymorphism analyses, heteroduplex analysis, denaturing gradient gel 
electrophoresis, chemical mismatch cleavage, and oligonucleotide hybridization may be utilized. 

Genotyping techniques known to one skilled in the art can be used to type polymorphisms that
are in close proximity to the mutations in a C2GnT3 gene. The polymorphisms may be used to
identify individuals in families that are likely to carry mutations. If a polymorphism exhibits
linkage disequilibrium with mutations in the C2GnT3 gene, it can also be used to screen for
individuals in the general population likely to carry mutations. Polymorphisms which may be
used include restriction fragment length polymorphisms (RFLPs), single-nucleotide
polymorphisms (SNPs), and simple sequence repeat polymorphisms (SSLPs).

A probe or primer of the invention may be used to directly identify RFLPs. A probe or primer of 
the invention can additionally be used to isolate genomic clones such as YACs, BACs, PACs, 
cosmids, phage or plasmids. The DNA in the clones can be screened for SSLPs using 
hybridization or sequencing procedures. 

Hybridization and amplification techniques described herein may be used to assay qualitative
and quantitative aspects of C2GnT3 expression. For example, RNA may be isolated from a cell
type or tissue known to express C2GnT3 and tested utilizing the hybridization (e.g. standard
Northern analyses) or PCR techniques referred to herein. The techniques may be used to detect
differences in transcript size that may be due to normal or abnormal alternative splicing. The
techniques may be used to detect quantitative differences between levels of full length and/or
alternatively spliced transcripts detected in normal individuals relative to those individuals
exhibiting symptoms of a disease.

The primers and probes may be used in the above described methods in situ, i.e., directly on tissue
sections (fixed and/or frozen) of patient tissue obtained from biopsies or resections.

Oligonucleotides or longer fragments derived from any of the nucleic acid molecules of the
invention may be used as targets in a microarray. The microarray can be used to simultaneously
monitor the expression levels of large numbers of genes and to identify genetic variants,
mutations, and polymorphisms. The information from the microarray may be used to determine
gene function, to understand the genetic basis of a disorder, to identify predisposition to a
disorder, to treat a disorder, to diagnose a disorder, and to develop and monitor the activities of
therapeutic agents.

The preparation, use, and analysis of microarrays are well known to a person skilled in the art 
(see, for example, Brennan, T. M., et al. (1995), U.S. Patent No. 5,474,796; Schena et al. (1996), 
Proc. Natl. Acad. Sci. 93:10614-10619; Baldeschweiler et al. (1995), PCT Application 
WO95/251116; Shalon, D., et al. (1995), PCT application WO95/35505; Heller, R. A., et al. 
(1997), Proc. Natl. Acad. Sci. 94:2150-2155; and Heller, M. J., et al. (1997), U.S. Patent No. 
5,605,662). 

Methods for Detecting Polypeptides 

Antibodies specifically reactive with a C2GnT3 polypeptide, or derivatives, such as enzyme 
conjugates or labeled derivatives, may be used to detect C2GnT3 polypeptides in various 
biological materials. They may be used as diagnostic or prognostic reagents and they may be 
used to detect abnormalities in the level of C2GnT3 polypeptide expression, or abnormalities in 
the structure, and/or temporal, tissue, cellular, or subcellular location of the polypeptides. 
Antibodies may also be used to screen potentially therapeutic compounds in vitro to determine 
their effects on a condition such as a thymus-related disorder or cancer. In vitro immunoassays 
may also be used to assess or monitor the efficacy of particular therapies. Preferably, antibodies 
for use in a detection assay have a dissociation constant lower than 1 µM, even more preferably 
lower than or about 10 nM. 
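
To put the dissociation-constant preference in concrete terms, the following is a minimal sketch assuming a simple 1:1 binding model at equilibrium with antibody in excess, in which the fraction of antigen bound is [Ab]/([Ab] + Kd). The function name and the 100 nM antibody concentration are illustrative assumptions and are not values taken from this specification.

```python
def fraction_bound(antibody_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of antigen bound under a simple 1:1 model.

    Assumes free antibody concentration is approximately the total
    antibody concentration (antibody in excess over antigen).
    """
    if antibody_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return antibody_conc_molar / (antibody_conc_molar + kd_molar)


if __name__ == "__main__":
    antibody = 100e-9  # hypothetical 100 nM antibody (assumption)
    for kd in (1e-6, 10e-9):  # 1 uM and 10 nM, the two thresholds named above
        bound = fraction_bound(antibody, kd)
        print(f"Kd = {kd:.0e} M -> fraction bound = {bound:.3f}")
```

Under these assumptions, lowering the dissociation constant from 1 µM to 10 nM raises the bound fraction from roughly one eleventh to roughly ten elevenths at the same antibody concentration, which is why the tighter-binding antibody is preferred for detection.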

The antibodies of the invention may also be used in vitro to determine the level of C2GnT3 
polypeptide expression in cells genetically engineered to produce a C2GnT3 polypeptide. The 
antibodies may be used to detect and quantify polypeptides of the invention in a sample in order 
to determine their role in particular cellular events or pathological states, and to diagnose and 
treat such pathological states. 

In particular, the antibodies of the invention may be used in immuno-histochemical analyses, for 
example, at the cellular and subcellular level, to detect a polypeptide of the invention, to 
localize it to particular cells and tissues, and to specific subcellular locations, and to quantitate 
the level of expression. 

The antibodies may be used in any known immunoassays that rely on the binding interaction 
between an antigenic determinant of a polypeptide of the invention, and the antibodies. 
Examples of such assays are radioimmunoassays, enzyme immunoassays (e.g. ELISA), 
immunofluorescence, immunoprecipitation, latex agglutination, hemagglutination, and 
histochemical tests. 

Cytochemical techniques known in the art for localizing antigens using light and electron 
microscopy may be used to detect a polypeptide of the invention. Generally, an antibody of the 
invention may be labelled with a detectable substance and a polypeptide may be localised in 
tissues and cells based upon the presence of the detectable substance. Various methods of 
labelling polypeptides are known in the art and may be used. Examples of detectable substances 
include, but are not limited to, the following: radioisotopes (e.g., 3H, 14C, 35S, 125I), 
fluorescent labels (e.g., FITC, Rhodamine, lanthanide phosphors), luminescent labels such as 
luminol, enzymatic labels (e.g., horseradish peroxidase, β-galactosidase, luciferase, alkaline 
phosphatase, acetylcholinesterase), biotinyl groups (which can be detected by marked avidin e.g., 
streptavidin containing a fluorescent marker or enzymatic activity that can be detected by optical 
or colorimetric methods), predetermined polypeptide epitopes recognized by a secondary 
reporter (e.g., leucine zipper pair sequences, binding sites for secondary antibodies, metal 
binding domains, epitope tags). In some embodiments, labels are attached via spacer arms of 
various lengths to reduce potential steric hindrance. Antibodies may also be coupled to electron 
dense substances, such as ferritin or colloidal gold, which are readily visualised by electron 
microscopy. 

The antibody or sample may be immobilized on a carrier or solid support which is capable of 
immobilizing cells, antibodies, etc. For example, the carrier or support may be nitrocellulose, or 
glass, polyacrylamides, gabbros, and magnetite. The support material may have any possible 
configuration including spherical (e.g. bead), cylindrical (e.g. inside surface of a test tube or 
well, or the external surface of a rod), or flat (e.g. sheet, test strip). Indirect methods may also be 
employed in which the primary antigen-antibody reaction is amplified by the introduction of a 
second antibody, having specificity for the antibody reactive against a polypeptide of the 
invention. By way of example, if the antibody having specificity against a polypeptide of the 
invention is a rabbit IgG antibody, the second antibody may be goat anti-rabbit gamma-globulin 
labelled with a detectable substance as described herein. 

Where a radioactive label is used as a detectable substance, a polypeptide of the invention may 
be localized by radioautography. The results of radioautography may be quantitated by 
determining the density of particles in the radioautographs by various optical methods, or by 
counting the grains. 

A polypeptide of the invention may also be detected by assaying for C2GnT3 activity as 
described herein. For example, a sample may be reacted with an acceptor substrate and a donor 
substrate under conditions where a C2GnT3 polypeptide is capable of transferring the donor 
substrate to the acceptor substrate to produce a donor substrate-acceptor substrate complex. 

Methods for Identifying or Evaluating Substances / Compounds 

The methods described herein are designed to identify substances and compounds that modulate 
the expression or biological activity of a C2GnT3 polypeptide including substances that interfere 
with or enhance the expression or activity of a C2GnT3 polypeptide. 

Substances and compounds identified using the methods of the invention include but are not 
limited to peptides such as soluble peptides including Ig-tailed fusion peptides, members of 
random peptide libraries and combinatorial chemistry-derived molecular libraries made of D- 
and/or L-configuration amino acids, phosphopeptides (including members of random or partially 
degenerate, directed phosphopeptide libraries), antibodies [e.g. polyclonal, monoclonal, 
humanized, anti-idiotypic, chimeric, single chain antibodies, fragments, (e.g. Fab, F(ab)2, and 
Fab expression library fragments, and epitope-binding fragments thereof)], polypeptides, nucleic 
acids, carbohydrates, and small organic or inorganic molecules. A substance or compound may 
be an endogenous physiological compound or it may be a natural or synthetic compound. 

Modulation of a C2GnT3 polypeptide can be evaluated, for instance, by evaluating the 
inhibitory/stimulatory effect of an agent on C2GnT3 biological activity in comparison to a 
control or reference. The control or reference may be, e.g., a predetermined reference value, or 
may be evaluated experimentally. For example, in a cell-based assay where a host cell 
expressing recombinant C2GnT3 is incubated in a medium containing a potential modulating 
agent, a control or reference may be, e.g., a host cell incubated with an agent having a known 
effect on C2GnT3 expression/activity, a host cell incubated in the same medium without any 
agent, a host cell transfected with a "mock" vector not expressing any C2GnT3 polypeptide, or 
any other suitable control or reference. In a cell-free assay where C2GnT3 polypeptide is 
incubated in a medium containing a potential modulating agent, a control or reference may be, 
for example, medium not containing C2GnT3 polypeptide, medium not containing any agent, 
medium containing a reference polypeptide or agent, or any other suitable control or reference. 
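
The comparison against a control can be sketched numerically. The following is a minimal illustration, assuming that each condition yields a single scalar activity reading in arbitrary units; the function name, argument names, and example readings are hypothetical. The "mock" vector (or enzyme-free medium) reading is treated as background and the no-agent reading as the 100% reference.

```python
def percent_of_control(test_signal: float, untreated_signal: float,
                       mock_signal: float = 0.0) -> float:
    """Activity with the agent as a percentage of the no-agent control.

    mock_signal is the background from cells transfected with a "mock"
    vector (or from medium lacking C2GnT3 polypeptide); it is subtracted
    from both readings before the ratio is taken.
    """
    window = untreated_signal - mock_signal
    if window <= 0:
        raise ValueError("no-agent control must exceed the mock background")
    return 100.0 * (test_signal - mock_signal) / window


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units.
    print(percent_of_control(test_signal=600, untreated_signal=1100, mock_signal=100))
```

A value below 100 would suggest an inhibitory effect and a value above 100 a stimulatory effect, subject to the variability of the assay, which this sketch does not model.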

Substances which modulate a C2GnT3 polypeptide can be identified based on their ability to 
associate with a C2GnT3 polypeptide. Therefore, the invention also provides methods for 
identifying substances that associate with a C2GnT3 polypeptide. Substances identified using the 
methods of the invention may be isolated, cloned and sequenced using conventional techniques. 
A substance that associates with a polypeptide of the invention may be an agonist or antagonist 
of the biological or immunological activity of a polypeptide of the invention. 

The term "agonist" refers to a molecule that increases the amount of, or prolongs the duration of, 
the activity of the polypeptide. The term "antagonist" refers to a molecule which decreases the 
biological or immunological activity of the polypeptide. Agonists and antagonists may include 
proteins, nucleic acids, carbohydrates, or any other molecules that associate with a polypeptide 
of the invention. 

Substances which can associate with a C2GnT3 polypeptide may be identified by reacting a 
C2GnT3 polypeptide with a test substance which potentially associates with a C2GnT3 
polypeptide, under conditions which permit the association, and removing and/or detecting the 
associated C2GnT3 polypeptide and substance. Substance-polypeptide complexes, free 
substance, or non-complexed polypeptides may be assayed. Conditions which permit the 
formation of substance-polypeptide complexes may be selected having regard to factors such as 
the nature and amounts of the substance and the polypeptide. 

The substance-polypeptide complex, free substance or non-complexed polypeptides may be 
isolated by conventional isolation techniques, for example, salting out, chromatography, 
electrophoresis, gel filtration, fractionation, absorption, polyacrylamide gel electrophoresis, 
agglutination, or combinations thereof. To facilitate the assay of the components, antibody 
against a polypeptide of the invention or the substance, or labelled polypeptide, or a labelled 
substance may be utilized. The antibodies, polypeptides, or substances may be labelled with a 
detectable substance as described above. 

A C2GnT3 polypeptide, or the substance used in the method of the invention may be 
insolubilized. For example, a polypeptide, or substance may be bound to a suitable carrier such 
as agarose, cellulose, dextran, Sephadex, Sepharose, carboxymethyl cellulose, polystyrene, filter 
paper, ion-exchange resin, plastic film, plastic tube, glass beads, polyamine-methyl vinyl-ether- 
maleic acid copolymer, amino acid copolymer, ethylene-maleic acid copolymer, nylon, silk, etc. 
The carrier may be in the shape of, for example, a tube, test plate, beads, disc, sphere etc. The 
insolubilized polypeptide or substance may be prepared by reacting the material with a suitable 
insoluble carrier using known chemical or physical methods, for example, cyanogen bromide 
coupling. 

The invention also contemplates a method for evaluating a compound for its ability to modulate 
the biological activity of a polypeptide of the invention, by assaying for an agonist or antagonist 
(i.e. enhancer or inhibitor) of the association of the polypeptide with a substance which interacts 
with the polypeptide (e.g. donor or acceptor substrates or parts thereof). The basic method for 
evaluating if a compound is an agonist or antagonist of the association of a polypeptide of the 
invention and a substance that associates with the polypeptide is to prepare a reaction mixture 
containing the polypeptide and the substance under conditions which permit the formation of 
substance-polypeptide complexes, in the presence of a test compound. The test compound may 
be initially added to the mixture, or may be added subsequent to the addition of the polypeptide 
and substance. Control reaction mixtures without the test compound or with a placebo are also 
prepared. The formation of complexes is detected and the formation of complexes in the control 
reaction but not in the reaction mixture indicates that the test compound interferes with the 
interaction of the polypeptide and substance. The reactions may be carried out in the liquid phase 
or the polypeptide, substance, or test compound may be immobilized as described herein. The 
agent can be selected from compounds, compositions, antibodies or antibody fragments, 
antisense sequences and ribozyme nucleotide sequences for C2GnT3 polypeptide. 

It will be understood that the agonists and antagonists, i.e. inhibitors and enhancers, that can be 
assayed using the methods of the invention may act on one or more of the interaction sites on the 
polypeptide or substance including agonist binding sites, competitive antagonist binding sites, 
non-competitive antagonist binding sites or allosteric sites. 

The invention also makes it possible to screen for antagonists that inhibit the effects of an agonist 
of the interaction of a polypeptide of the invention with a substance which is capable of 
associating with the polypeptide. Thus, the invention may be used to assay for a compound that 
competes for the same interacting site of a polypeptide of the invention. 

Substances that modulate a C2GnT3 polypeptide of the invention can be identified based on their 
ability to interfere with or enhance the activity of a C2GnT3 polypeptide. Therefore, the 
invention provides a method for evaluating a compound for its ability to modulate the activity of 
a C2GnT3 polypeptide comprising (a) reacting an acceptor substrate and a donor substrate for a 
C2GnT3 polypeptide in the presence of a test substance; (b) measuring the amount of donor 
substrate transferred to acceptor substrate, and (c) carrying out steps (a) and (b) in the absence of 
the test substance to determine if the substance interferes with or enhances transfer of the sugar 
donor to the acceptor by the C2GnT3 polypeptide. 
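
Steps (a) to (c) reduce to measuring the amount of donor sugar transferred with and without the test substance and comparing the two. The following is a minimal sketch, assuming a radiolabeled donor read out as counts per minute (cpm) and a known specific activity of the donor. The function names, the example counts, and the 20% cutoff are illustrative assumptions and are not taken from this specification.

```python
def pmol_transferred(sample_cpm: float, blank_cpm: float,
                     specific_activity_cpm_per_pmol: float) -> float:
    """Convert background-corrected counts into pmol of sugar transferred."""
    if specific_activity_cpm_per_pmol <= 0:
        raise ValueError("specific activity must be positive")
    return max(sample_cpm - blank_cpm, 0.0) / specific_activity_cpm_per_pmol


def classify_effect(with_substance_pmol: float, without_substance_pmol: float,
                    cutoff: float = 0.20) -> str:
    """Label the test substance by the relative change in transfer.

    cutoff is an arbitrary illustrative threshold (a 20% change), not a
    value taken from the text.
    """
    if without_substance_pmol <= 0:
        raise ValueError("reference reaction shows no transfer")
    change = (with_substance_pmol - without_substance_pmol) / without_substance_pmol
    if change <= -cutoff:
        return "interferes"
    if change >= cutoff:
        return "enhances"
    return "no clear effect"


if __name__ == "__main__":
    # Hypothetical counts: steps (a)/(b) with the substance, step (c) without.
    with_substance = pmol_transferred(2200, 200, 50)
    without_substance = pmol_transferred(5200, 200, 50)
    print(with_substance, without_substance,
          classify_effect(with_substance, without_substance))
```

In practice replicate reactions and a statistical test would replace the fixed cutoff; the sketch only shows the arithmetic of the comparison.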

Suitable acceptor substrates for use in the methods of the invention are a saccharide, 
oligosaccharides, polysaccharides, polypeptides, glycopolypeptides, or glycolipids which are 
either synthetic with linkers at the reducing end or naturally occurring structures, for example, 
asialo-agalacto-fetuin glycopeptide. Acceptors will generally comprise β-D-galactosyl-1,3-N- 
acetyl-D-galactosaminyl-. 

The donor substrate may be a nucleotide sugar, dolichol-phosphate-sugar or dolichol- 
pyrophosphate-oligosaccharide, for example, uridine diphospho-N-acetylglucosamine (UDP- 
GlcNAc), or derivatives or analogs thereof. The C2GnT3 polypeptide may be obtained from 
natural sources or produced using recombinant methods as described herein. 

The acceptor or donor substrates may be labeled with a detectable substance as described herein, 
and the interaction of the polypeptide of the invention with the acceptor and donor will give rise 
to a detectable change. The detectable change may be colorimetric, photometric, radiometric, 
potentiometric, etc. The activity of C2GnT3 polypeptide of the invention may also be determined 
using methods based on HPLC (Koenderman et al, FEBS Lett. 222: 42, 1987) or methods 
employing synthetic oligosaccharide acceptors attached to hydrophobic aglycones (Palcic et al, 
Glycoconjugate J. 5:49, 1988; and Pierce et al, Biochem. Biophys. Res. Comm. 146: 679, 1987). 

The C2GnT3 polypeptide is reacted with the acceptor and donor substrates at a pH and 
temperature effective for the polypeptide to transfer the donor to the acceptor, and where one of 
the components is labeled, to produce a detectable change. It is preferred to use a buffer with the 
acceptor and donor to maintain the pH within the pH range effective for the polypeptides. The 
buffer, acceptor and donor may be used as an assay composition. Other compounds such as 
EDTA and detergents may be added to the assay composition. 

The reagents suitable for applying the methods of the invention to evaluate compounds that 
modulate a C2GnT3 polypeptide may be packaged into convenient kits providing the necessary 
materials packaged into suitable containers. The kits may also include suitable supports useful in 
performing the methods of the invention. 

Substances that modulate a C2GnT3 polypeptide can also be identified by treating immortalized 
cells which express the polypeptide with a test substance, and comparing the morphology of the 
cells with the morphology of the cells in the absence of the substance and/or with immortalized 
cells which do not express the polypeptide. Examples of immortalized cells that can be used 
include lung epithelial cell lines such as Mv1Lu or HEK293 (human embryonal kidney) 
transfected with a vector containing a nucleic acid of the invention. In the absence of an inhibitor 
the cells show signs of morphologic transformation (e.g. fibroblastic morphology, spindle shape 
and pile up; the cells are less adhesive to substratum; there is less cell to cell contact in 
monolayer culture; there are reduced growth-factor requirements for survival and proliferation; the 
cells grow in soft-agar or other semi-solid medium; there is a lack of contact inhibition and 
increased apoptosis in low-serum high density cultures; there is enhanced cell motility, and there 
is invasion into extracellular matrix and secretion of proteases). A substance that inhibits one or 
more of these phenotypes may be considered an inhibitor. 

A substance that inhibits a C2GnT3 polypeptide may be identified by treating a cell which 
expresses the polypeptide with a test substance, and assaying for complex core 2-based O-linked 
structures (e.g. repeating Gal[β]1-4GlcNAc[β]) associated with the cell. The complex core 2- 
based O-linked structures can be assayed using a substance that binds to the structures (e.g. 
antibodies). Cells that have not been treated with the substance or which do not express the 
polypeptide may be employed as controls. 

Substances which inhibit transcription or translation of a C2GnT3 gene may be identified by 
transfecting a cell with an expression vector comprising a recombinant molecule of the 
invention, including a reporter gene, in the presence of a test substance and comparing the level 
of expression of the C2GnT3 polypeptide, or the expression of the polypeptide encoded by the 
reporter gene with a control cell transfected with the nucleic acid molecule in the absence of the 
substance. The method can be used to identify transcription and translation inhibitors of a 
C2GnT3 gene. 

Compositions and Treatments 

The substances or compounds identified by the methods described herein, polypeptides, nucleic 
acid molecules, and antibodies of the invention may be used for modulating the biological 
activity of a C2GnT3 polypeptide, and they may be used in the treatment of conditions mediated 
by a C2GnT3 polypeptide. In particular, they may be used to modulate T-cell development and 
lymphocyte homing and they may be used in the prevention and treatment of thymus-related 
disorders. 

Therefore, the present invention may be useful for diagnosis or treatment of various thymus- 
related disorders in mammals, preferably humans. Such disorders include the following: tumors 
and cancers, hypoactivity, hyperactivity, atrophy, enlargement of the thymus, and the like. Other 
disorders include dysregulation of T-lymphocyte selection or activity and would include but not 
be limited to disorders involving autoimmunity, arthritis, leukemias, lymphomas, 
immunosuppression, sepsis, wound healing, acute and chronic inflammation, cell mediated immunity, 
humoral immunity, TH1/TH2 imbalance, and the like. 

The substances or compounds identified by the methods described herein, antibodies, and 
polypeptides, and nucleic acid molecules of the invention may be useful in the prevention and 
treatment of tumors. Tumor metastasis may be inhibited or prevented by inhibiting the adhesion 
of circulating cancer cells. The substances, compounds, etc. of the invention may be especially 
useful in the treatment of various forms of neoplasia such as leukemias, lymphomas, melanomas, 
adenomas, sarcomas, and carcinomas of solid tissues in patients. In particular the composition 
may be used for treating malignant melanoma, pancreatic cancer, cervico-uterine cancer, cancer 
of the liver, kidney, stomach, lung, rectum, breast, bowel, gastric, thyroid, neck, cervix, salivary 
gland, bile duct, pelvis, mediastinum, urethra, bronchogenic, bladder, esophagus and colon, and 
Kaposi's Sarcoma which is a form of cancer associated with HIV-infected patients with 
Acquired Immune Deficiency Syndrome (AIDS). The substances etc. are particularly useful in 
the prevention and treatment of tumors of the immune system and thymus and the metastases 
derived from these tumors. 

A substance or compound identified in accordance with the methods described herein, 
antibodies, polypeptides, or nucleic acid molecules of the invention may be used to modulate T- 
cell activation and immunodeficiency due to the Wiskott-Aldrich syndrome or AIDS, or to 
stimulate hematopoietic progenitor cell growth, and/or confer protection against chemotherapy 
and radiation therapy in a subject. 

Accordingly, the substances, antibodies, and compounds may be formulated into pharmaceutical 
compositions for administration to subjects in a biologically compatible form suitable for 
administration in vivo. By biologically compatible form suitable for administration in vivo is 
meant a form of the substance to be administered in which any toxic effects are outweighed by 
the therapeutic effects. The substances may be administered to living organisms including 
humans, and animals. Administration of a therapeutically active amount of the pharmaceutical 
compositions of the present invention is defined as an amount effective, at dosages and for 
periods of time necessary to achieve the desired result. For example, a therapeutically active 
amount of a substance may vary according to factors such as the disease state, age, sex, and 
weight of the individual, and the ability of antibody to elicit a desired response in the individual. 
Dosage regimens may be adjusted to provide the optimum therapeutic response. For example, 
several divided doses may be administered daily or the dose may be proportionally reduced as 
indicated by the exigencies of the therapeutic situation. 

The active substance may be administered in a convenient manner such as by injection 
(subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or 
rectal administration. Depending on the route of administration, the active substance may be 
coated in a material to protect the compound from the action of enzymes, acids and other natural 
conditions that may inactivate the compound. 

The compositions described herein can be prepared by methods known per se for the preparation 
of pharmaceutically acceptable compositions which can be administered to subjects, such that an 
effective quantity of the active substance is combined in a mixture with a pharmaceutically 
acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical 
Sciences (Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., USA 
1985). On this basis, the compositions include, albeit not exclusively, solutions of the substances 
or compounds in association with one or more pharmaceutically acceptable vehicles or diluents, 
and contained in buffered solutions with a suitable pH and iso-osmotic with the physiological 
fluids. 



After pharmaceutical compositions have been prepared, they can be placed in an appropriate 
container and labeled for treatment of an indicated condition. For administration of an inhibitor 
of a polypeptide of the invention, such labeling would include amount, frequency, and method of 
administration. 

The nucleic acids encoding C2GnT3 polypeptides or any fragment thereof, or antisense 
sequences may be used for therapeutic purposes. Antisense to a nucleic acid molecule encoding a 
polypeptide of the invention may be used in situations to block the synthesis of the polypeptide. 
In particular, cells may be transformed with sequences complementary to nucleic acid molecules 
encoding C2GnT3 polypeptide. Thus, antisense sequences may be used to modulate C2GnT3 
activity or to achieve regulation of gene function. Sense or antisense oligomers or larger 
fragments, can be designed from various locations along the coding or regulatory regions of 
sequences encoding a polypeptide of the invention. 

Expression vectors may be derived from retroviruses, adenoviruses, herpes or vaccinia viruses or 
from various bacterial plasmids for delivery of nucleic acid sequences to the target organ, tissue, 
or cells. Vectors that express antisense nucleic acid sequences of C2GnT3 polypeptide can be 
constructed using techniques well known to those skilled in the art (see for example, Sambrook, 
Fritsch, Maniatis, Molecular Cloning, A Laboratory Manual, Second Edition (1989) Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, N.Y.). 

Genes encoding C2GnT3 polypeptide can be turned off by transforming a cell or tissue with 
expression vectors that express high levels of a nucleic acid molecule or fragment thereof which 
encodes a polypeptide of the invention. Such constructs may be used to introduce untranslatable 
sense or antisense sequences into a cell. Even if they do not integrate into the DNA, the vectors 
may continue to transcribe RNA molecules until all copies are disabled by endogenous 
nucleases. Transient expression may last for extended periods of time (e.g. a month or more) 
with a non-replicating vector or if appropriate replication elements are part of the vector system. 

Modification of gene expression may be achieved by designing antisense molecules, DNA, 
RNA, or PNA, to the control regions of a C2GnT3 polypeptide gene, i.e. the promoters, 
enhancers, and introns. Preferably the antisense molecules are oligonucleotides derived from the 
transcription initiation site (e.g. between positions -10 and +10 from the start site). Inhibition can 
also be achieved by using triple-helix base-pairing techniques. Triple helix pairing causes 
inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, 
transcription factors, or regulatory molecules (see Gee J.E. et al (1994) In: Huber, B.E. and B.I. 
Carr, Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco, N.Y.). 
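
As an illustration of deriving an antisense oligonucleotide from the region between positions -10 and +10 of the transcription initiation site, the following minimal sketch takes the reverse complement of that window. It assumes the sense-strand DNA sequence and the index of the +1 nucleotide are already known; the function name and the placeholder sequence in the usage lines are hypothetical and are not C2GnT3 sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_strand: str, plus_one_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Return the DNA antisense oligonucleotide (written 5'->3') for the
    window spanning -upstream..+downstream around the start site.

    plus_one_index is the 0-based index of the +1 nucleotide in
    sense_strand; by convention there is no position 0, so the default
    window is 20 nucleotides long.
    """
    seq = sense_strand.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sense strand must contain only A, C, G, T")
    start = plus_one_index - upstream
    end = plus_one_index + downstream
    if start < 0 or end > len(seq):
        raise ValueError("window extends beyond the supplied sequence")
    # Complement each base, then reverse to read the oligo 5'->3'.
    return seq[start:end].translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    placeholder = "TT" + "ACGTACGTAC" + "GGGGGCCCCC" + "TT"  # not a real gene
    print(antisense_oligo(placeholder, plus_one_index=12))
```

The sketch covers sequence selection only; it does not assess specificity, secondary structure, or chemical modification of the oligonucleotide.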

Ribozymes, enzymatic RNA molecules, may be used to catalyze the specific cleavage of RNA. 
Ribozyme action involves sequence-specific hybridization of the ribozyme molecule to 
complementary target RNA, followed by endonucleolytic cleavage. For example, hammerhead 
motif ribozyme molecules may be engineered that can specifically and efficiently catalyze 
endonucleolytic cleavage of sequences encoding a polypeptide of the invention. 

Specific ribozyme cleavage sites within any RNA target may be initially identified by scanning 
the target molecule for ribozyme cleavage sites which include the following sequences: GUA, 
GUU, and GUC. Short RNA sequences of between 15 and 20 ribonucleotides corresponding to 
the region of the cleavage site of the target gene may be evaluated for secondary structural 
features which may render the oligonucleotide inoperable. The suitability of candidate targets 
may be evaluated by testing accessibility to hybridization with complementary oligonucleotides 
using ribonuclease protection assays. 
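
The initial scan for ribozyme cleavage sites can be sketched as a simple string search. The following minimal example, with a hypothetical function name and a placeholder input sequence, lists each GUA, GUU, or GUC triplet in a target RNA together with a surrounding window of 15 to 20 ribonucleotides. How the window is centered on the triplet is an assumption made for illustration.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def candidate_sites(rna: str, window: int = 20):
    """Yield (position, triplet, surrounding_window) for each GUA/GUU/GUC.

    position is the 0-based index of the G. The surrounding window is
    roughly centered on the triplet and is shorter than `window` only
    when the RNA itself is shorter than `window`.
    """
    if not 15 <= window <= 20:
        raise ValueError("window should be between 15 and 20 ribonucleotides")
    seq = rna.upper().replace("T", "U")  # accept DNA-style input as well
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            # Clamp the window so it stays inside the sequence.
            start = max(0, min(i + 1 - window // 2, len(seq) - window))
            yield i, triplet, seq[start:start + window]


if __name__ == "__main__":
    placeholder = "A" * 12 + "GUC" + "A" * 12  # not a real transcript
    for position, triplet, region in candidate_sites(placeholder):
        print(position, triplet, region)
```

The windows returned here would still need to be evaluated for secondary structure and tested for accessibility, for example by the ribonuclease protection assays mentioned above; the sketch performs neither step.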

Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in 
cell cultures or with experimental animals, such as by calculating the ED50 (the dose 
therapeutically effective in 50% of the population) or LD50 (the dose lethal to 50% of the 
population) statistics. The therapeutic index is the dose ratio of toxic to therapeutic effects and it 
can be expressed as the LD50/ED50 ratio. Pharmaceutical compositions which exhibit large 
therapeutic indices are preferred. 
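
As a worked illustration, the sketch below computes a therapeutic index in the conventional orientation, LD50 divided by ED50, so that a larger index corresponds to a wider margin between the effective and the lethal dose. The function name and the dose values are hypothetical.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Conventional therapeutic index: LD50 divided by ED50 (same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical doses in mg/kg.
    print(therapeutic_index(ld50=500.0, ed50=10.0))
```

With these illustrative numbers the lethal dose is fifty times the effective dose; a composition with a smaller ratio would leave less room between efficacy and toxicity.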

The invention also provides methods for studying the function of a C2GnT3 polypeptide. Cells, 
tissues, and non-human animals lacking in C2GnT3 expression or partially lacking in C2GnT3 
expression may be developed using recombinant expression vectors of the invention having 
specific deletion or insertion mutations in a C2GnT3 gene. A recombinant expression vector may 
be used to inactivate or alter the endogenous gene by homologous recombination, and thereby 
create a C2GnT3 deficient cell, tissue or animal. 

Null alleles may be generated in cells, such as embryonic stem cells by deletion mutation. A 
recombinant C2GnT3 gene may also be engineered to contain an insertion mutation which 
inactivates C2GnT3. Such a construct may then be introduced into a cell, such as an embryonic 
stem cell, by a technique such as transfection, electroporation, injection etc. Cells lacking an 
intact C2GnT3 gene may then be identified, for example by Southern blotting, Northern blotting 
or by assaying for expression of a polypeptide of the invention using the methods described 
herein. Such cells may then be used to generate transgenic non-human animals deficient in 
C2GnT3. Germline transmission of the mutation may be achieved, for example, by aggregating 
the embryonic stem cells with early stage embryos, such as 8 cell embryos, in vitro; transferring 
the resulting blastocysts into recipient females; and generating germline transmission of the 
resulting aggregation chimeras. Such a mutant animal may be used to define specific cell 
populations, developmental patterns and in vivo processes, normally dependent on C2GnT3 
expression. 

The invention thus provides a transgenic non-human mammal all of whose germ cells and 
somatic cells contain a recombinant expression vector that inactivates or alters a gene encoding 
a C2GnT3 polypeptide. Further the invention provides a transgenic non-human mammal, which 
does not express a C2GnT3 polypeptide of the invention. 

A transgenic non-human animal includes but is not limited to mouse, rat, rabbit, sheep, hamster, 
guinea pig, micro-pig, pig, dog, cat, goat, and non-human primate, preferably mouse. 

The invention also provides a transgenic non-human animal assay system which provides a 
model system for testing for an agent that reduces or inhibits a pathology associated with a 
C2GnT3 polypeptide comprising: (a) administering the agent to a transgenic non-human animal 
of the invention; and (b) determining whether said agent reduces or inhibits the pathology in the 
transgenic non-human animal relative to a transgenic non-human animal of step (a) to which the 
agent has not been administered. 

The agent may be useful to treat the disorders and conditions discussed herein. The agents may 
also be incorporated in a pharmaceutical composition as described herein. 

A polypeptide of the invention may be used to support the survival, growth, migration, and/or 
differentiation of cells expressing the polypeptide. Thus, a polypeptide of the invention may be 
used as a supplement to support, for example, cells in culture. 

Methods to Prepare Oligosaccharides 

The invention relates to a method for preparing an oligosaccharide comprising contacting a 
reaction mixture comprising an activated donor substrate e.g. GlcNAc, and an acceptor substrate 
in the presence of a polypeptide of the invention. 

Examples of acceptor substrates for use in the method for preparing an oligosaccharide are a 
saccharide, oligosaccharides, polysaccharides, glycopeptides, glycopolypeptides, or glycolipids 
which are either synthetic with linkers at the reducing end or naturally occurring structures, for 
example, asialo-agalacto-fetuin glycopeptide. The activated donor substrate is preferably 
GlcNAc which may be part of a nucleotide-sugar, a dolichol-phosphate-sugar, or dolichol- 
pyrophosphate-oligosaccharide. 

In an embodiment of the invention, the oligosaccharides are prepared on a carrier that is non- 
toxic to a mammal, in particular a human, such as a lipid isoprenoid or polyisoprenoid alcohol. 
An example of a suitable carrier is dolichol phosphate. The oligosaccharide may be attached to a 
carrier via a labile bond allowing for chemical removal of the oligosaccharide from the lipid 
carrier. In the alternative, the oligosaccharide transferase may be used to transfer the 
oligosaccharide from a lipid carrier to a polypeptide. 

The following examples are intended to further illustrate the invention without limiting its scope. 
EXAMPLE 1 

A: Identification of cDNA homologous to C2GnT3 by analysis of GSS database sequence 
information. 

Database searches were performed with the coding sequence of the human C2/4GnT (C2GnT2) 
sequence using the BLASTn and tBLASTn algorithms against the GSS database at The National 
Center for Biotechnology Information, USA. The BLASTn algorithm was used to identify GSSs 
representing the query gene (identities of > 95%), whereas tBLASTn was used to identify non- 
identical, but similar GSS sequences. GSSs with 50-90% nucleotide sequence identity were regar- 
ded as different from the query sequence. Composites of the sequence information for two GSSs 
were compiled and analysed for sequence similarity to human C2/4GnT (C2GnT2). 
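
The identity thresholds described above can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name, the labels, and the "unclassified" bin are hypothetical, and the text does not say how hits between 90% and 95% identity, or below 50%, were treated.

```python
def classify_gss_hit(percent_identity: float) -> str:
    """Bin a GSS hit by nucleotide identity to the query sequence.

    Thresholds follow the text: >95% identity is taken to represent the
    query gene itself; 50-90% identity is regarded as a different sequence.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity > 95.0:
        return "query gene"
    if 50.0 <= percent_identity <= 90.0:
        return "different from query"
    # Assumption: the text does not cover these ranges, so they are left open.
    return "unclassified"


# Hypothetical identities, for illustration only (not data from the text).
for identity in (98.0, 72.5, 93.0):
    print(identity, classify_gss_hit(identity))
```

Hits in the "different from query" bin are the ones that would be carried forward as candidate novel family members.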

B: Cloning and sequencing of C2GnT3 

A GSS clone CIT-HSP-2288B17TF (GSS GenBank accession number AQ005888), derived 
from a putative homologue to C2/4GnT (C2GnT2), was obtained from Research Genetics Inc., 
USA. Sequencing of this clone revealed a partial open reading frame with significant sequence 
similarity to C2/4GnT (C2GnT2). The coding region of human C2GnT-L (C2GnTl), C2/4GnT 
(C2GnT2) and a bovine homologue was previously found to be organized in one exon 
((22),(15)). Since the 3' sequence available from the C2GnT3 GSS was incomplete but likely to 
be located in the single exon, the missing 3' portion of the open reading frame was obtained by 
sequencing a genomic P1 clone. The P1 clone was obtained from a human foreskin genomic P1 
library (DuPont Merck Pharmaceutical Co. Human Foreskin Fibroblast P1 Library) by screening 
with the primer pair: 

TSHC96 (5'-GGTTTCACCGTCTCCAACATA-3', SEQ ID NO: 3) and 
TSHC101 (5'-TCGTAAGGCACCTGATACTT-3', SEQ ID NO: 6). 

One genomic clone for C2GnT3, GS22597 #844/Bl was obtained from Genome Systems Inc., 
USA. DNA from P1 phage was prepared as recommended by Genome Systems Inc. The entire 
coding sequence of the C2GnT3 gene was represented in the clone and sequenced in full using 
automated sequencing (ABI377, Perkin-Elmer). Confirmatory sequencing was performed on a 
cDNA clone obtained by PCR (30 cycles at 95 °C for 10 sec; 55 °C for 15 sec and 68 °C for 2 min 
30 sec) on cDNA from human thymus poly A-mRNA with the sense primer: 

TSHC99 (5'-CGAGGATCCAGAATGAAGATATTCAAATGTTA-3', SEQ ID NO: 4), 
and the anti-sense primer: 

TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9). 

The composite sequence contained an open reading frame of 1359 base pairs encoding a putative 
protein of 453 amino acids with type II domain structure predicted by the TMpred-algorithm at 
the Swiss Institute for Experimental Cancer Research (ISREC) 
(http://www.ch.embnet.org/software/TMPRED_form.html). 
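
The figures quoted in this section can be cross-checked with simple arithmetic. The sketch below assumes only the numbers given in the text (a 1359 base pair open reading frame; 30 cycles of 10 sec, 15 sec and 2 min 30 sec); the variable names are illustrative, and ramp times and any initial denaturation or final extension are not stated and are therefore ignored.

```python
# Open reading frame length as stated in the text.
ORF_BP = 1359
codons, remainder = divmod(ORF_BP, 3)
print(f"{ORF_BP} bp = {codons} codons, remainder {remainder}")

# PCR program as stated: 30 cycles of 10 s, 15 s and 150 s (2 min 30 sec).
CYCLES = 30
STEP_SECONDS = (10, 15, 150)
total_minutes = CYCLES * sum(STEP_SECONDS) / 60
print(f"Hold time over {CYCLES} cycles: {total_minutes} min")
```

The codon count agrees with the stated 453 amino acids, and the programmed hold times alone come to roughly an hour and a half.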

EXAMPLE 2 

A: Expression of C2GnT3 in Sf9 cells 

An expression vector construct designed to encode amino acid residues 39-453 of C2GnT3 was 
prepared by PCR using P1 DNA, and the primer pair: 

TSHC100 (5'-CGAGGATCCGCAAAAAGACATTTACTTGGTT-3', SEQ ID NO: 5) and 

TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9) 

with BamHI and EcoRI restriction sites, respectively (Fig. 2). The PCR product was cloned 
between the BamHI and EcoRI sites of pAcGP67A (PharMingen), and the insert was fully 
sequenced. pAcGP67-C2GnT3-sol was co-transfected with Baculo-Gold™ DNA (PharMingen) 
as described previously (23). Recombinant Baculo-viruses were obtained after two successive 
amplifications in Sf9 cells grown in serum-containing medium, and titers of virus were estimated 
by titration in 24-well plates with monitoring of enzyme activities. Transfection of Sf9-cells with 
pAcGP67-C2GnT3-sol resulted in marked increase in GlcNAc-transferase activity compared to 
uninfected cells or cells infected with a control construct. 
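
As an illustrative check of the construct design, the following sketch locates the BamHI (GGATCC) and EcoRI (GAATTC) recognition sequences in the two primers as printed above and computes the size of the encoded fragment (residues 39-453). The helper names are hypothetical; the primer sequences are copied from the text, and the coding length assumes three bases per residue with no stop codon counted.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "TSHC100": "CGAGGATCCGCAAAAAGACATTTACTTGGTT",
    "TSHC121": "AGCGAATTCTTACTATCATGATGTGGTAGTG",
}


def site_offset(primer: str, site: str) -> int:
    """Return the 0-based offset of a recognition site in a primer, or -1."""
    return primer.find(site)


def residues_spanned(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


bamhi_offset = site_offset(PRIMERS["TSHC100"], BAMHI_SITE)
ecori_offset = site_offset(PRIMERS["TSHC121"], ECORI_SITE)
n_residues = residues_spanned(39, 453)
coding_bp = 3 * n_residues

print("BamHI offset in TSHC100:", bamhi_offset)
print("EcoRI offset in TSHC121:", ecori_offset)
print("Residues encoded:", n_residues, "coding length (bp):", coding_bp)
```

In both primers the site sits three bases in from the 5' end, so each recognition sequence is preceded by a short 5' overhang.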

B: Analysis of C2GnT3 activity 

Standard assays were performed using culture supernatant from infected cells in 50 µl reaction 
mixtures containing 100 mM MES (pH 6.5), 0.1% Nonidet P-40, 150 µM UDP-[14C]-GlcNAc 
(2,000 cpm/nmol) (Amersham Pharmacia Biotech), and the indicated concentrations of acceptor 
substrates (Sigma and Toronto Research Laboratories Ltd., see Table I for structures). Reaction 
products were quantified by chromatography on Dowex AG1-X8. 
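
The assay conditions above imply a simple conversion from measured radioactivity to the amount of GlcNAc transferred. The sketch below is a minimal illustration, assuming the stated specific activity (2,000 cpm/nmol), donor concentration (150 µM) and reaction volume (50 µl); the function names and the example count of 3,000 cpm are hypothetical and are not results from the text.

```python
def donor_nmol(conc_uM: float, volume_ul: float) -> float:
    """Amount of donor substrate in the reaction (µM x µl = pmol; /1000 = nmol)."""
    return conc_uM * volume_ul / 1000.0


def nmol_transferred(product_cpm: float, cpm_per_nmol: float = 2000.0) -> float:
    """Convert product radioactivity (cpm) to nmol GlcNAc transferred."""
    return product_cpm / cpm_per_nmol


total_donor = donor_nmol(150.0, 50.0)    # donor present per reaction, in nmol
example_product = nmol_transferred(3000.0)  # hypothetical reading of 3,000 cpm
fraction_used = example_product / total_donor

print(f"Donor per reaction: {total_donor} nmol")
print(f"Product for 3,000 cpm: {example_product} nmol")
print(f"Fraction of donor consumed: {fraction_used:.0%}")
```

Each 50 µl reaction therefore holds 7.5 nmol of labeled donor, about 15,000 cpm in total, which bounds the counts that can appear in product.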



EXAMPLE 3 

Restricted organ expression pattern of C2GnT3 

A human RNA master blot (CLONTECH) was used for expression analysis. The cDNA- 
fragment of soluble C2GnT3 was used as a probe for hybridization. The probe was random 
primer-labeled using [α-32P]dATP and the Strip-EZ DNA labeling kit (Ambion). The 
membrane was probed for 6 h at 65 °C following the protocol of the manufacturer (CLONTECH) 
and washed five times for 20 min each at 65 °C with 2 x SSC, 1% SDS and twice for 20 min 
each at 55 °C with 0.1 x SSC, 0.5 % SDS. A human multiple tissue Northern blot MTN II 
(CLONTECH), was probed as described (24), and washed twice for 10 min each at room 
temperature with 2 x SSC, 0.1% SDS; twice for 10 min each at 55 °C with 1 x SSC, 0.1 % SDS; 
and once for 10 min with 0.1 x SSC, 0.1 % SDS at 55 °C. 

EXAMPLE 4 

Analysis of C2GnT3 gene expression in peripheral blood mononuclear cells 

PCR analysis of C2GnT3 expression in resting and activated human blood cell fractions was 
performed using the primer pair: 

TSHC118 (5'-GAGTCAGTGTGGAATTGAATAC-3', SEQ ID NO: 7) and 
TSHC126 (5'-CAACAGTCTCCTCAACCCTG-3', SEQ ID NO: 11). 

PCR amplifications with primers specific for human C2GnT3 (C2GnT3) or GAPDH (G3PDH, 
supplied by the manufacturer) were performed on a normalized human blood cell cDNA panel 
(MTC from CLONTECH) for 31 cycles. Expression of C2GnT3 transcript was detected in all 
peripheral blood mononuclear cell (PBMC) fractions with particularly high levels of expression 
in CD4 and CD8 positive T-lymphocytes (Figure 4). 

EXAMPLE 5 

Analysis of DNA polymorphism of the C2GnT3 gene 

Primer pairs such as: 

TSHC123 (5'-GGGGAGGATTTGGGTAGTATG-3', SEQ ID NO: 10) and 
TSHC119 (5'-GATGTCTGATTTGGCTGAGTG-3', SEQ ID NO: 8) 

as described in Figure 5 have been used for PCR amplification of individual sequences of the 
coding exon. Each PCR product was subcloned and the sequence of 10 clones containing the 
appropriate insert was determined assuring that both alleles of each individual are characterized. 
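
The rationale for sequencing 10 clones per individual can be illustrated with a short probability calculation. This is a sketch under a simplifying assumption that is not stated in the text: in a heterozygous individual, each subcloned PCR product derives from either allele with equal probability and independently of the other clones. The function name is hypothetical.

```python
from fractions import Fraction


def prob_only_one_allele_seen(n_clones: int) -> Fraction:
    """Chance that all n clones from a heterozygote derive from the same allele.

    Assumes equal, independent sampling of the two alleles (an idealization;
    real amplification or cloning bias would increase this chance).
    """
    if n_clones < 1:
        raise ValueError("at least one clone is required")
    # Either allele could be the one seen exclusively, hence the factor of 2.
    return 2 * Fraction(1, 2) ** n_clones


p_miss = prob_only_one_allele_seen(10)
print(f"Chance of missing an allele with 10 clones: {p_miss} (about {float(p_miss):.2%})")
```

Under this idealized model, 10 clones leave only a small chance that one allele of a heterozygote goes unsampled.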



Polymorphism of the amplified DNA can be analyzed using, e.g., DNA sequencing, single-strand 
conformational polymorphism (SSCP) or mismatch mutation. 



From the foregoing it will be evident that, although specific embodiments of the invention have 
been described herein for purposes of illustration, various modifications may be made without 
deviating from the spirit and scope of the invention. 